**About the Editor**

**Giorgio Gallinella** Born on 20.06.1963. Full Professor of Microbiology at the Department of Pharmacy and Biotechnology at the University of Bologna, Italy. He conducts research in the field of virology with a special interest in parvovirus B19.

## **Preface to "Advances in Parvovirus Research 2020"**

The Special Issue of *Viruses* 'Advances in Parvovirus Research 2020' features a series of articles collected in 2020 and 2021. The issue was conceived as a continuation of the 2019 Special Issue 'New Insights in Parvovirus Research', which successfully attracted the interest of scientists actively involved in research on parvoviruses. The contributions collectively cover many aspects of basic and translational research on viruses of the family *Parvoviridae*, which are incredibly diverse in their biology and have significant relevance as human and veterinary pathogens, as tools for oncolytic therapy, or as sophisticated gene delivery vectors.

Structural biology is the subject of articles contributed by Mietzsch et al. and Chun Yu et al., from the group of the late Mavis Agbandje-McKenna, a leading scientist whose achievements will exert a long-lasting influence on her colleagues. Mattola et al. provide a review of current knowledge and advanced experimental techniques for studying the initial phases of parvovirus replicative cycles. Ros et al. summarize known aspects, open issues, and future perspectives related to the minor capsid protein of parvovirus B19. Ferreira et al. report on the uptake mechanism of the oncolytic parvovirus H-1, while Boftsi et al. provide new mechanistic details of post-transcriptional processing in a minute virus of mice model. Parvovirus B19 is the subject of contributions from Bua et al. and Ducloux et al., both of which address issues with translational implications for a pathogenic human virus. Other contributions are focused on viruses of veterinary interest, including the review on perspectives related to Aleutian mink disease virus by Markarian et al. and the article on the construction of virus-like particles of tiger feline parvovirus by Jiao et al. Investigations into the molecular phylogenetics of canine parvovirus were the subject of further contributions by Kelman et al., Giraldo-Ramirez et al., and Gainor et al., while Horecka et al. report on the shift in epidemiology connected to the first wave of the COVID-19 pandemic. Finally, Mijanovic et al. present an overview of adeno-associated virus-derived vectors for gene therapy of neurodegenerative diseases, an area of impressive achievements and critical ethical debate.

In the years in which these papers were submitted, the most critical phases of the COVID-19 pandemic hindered normal research activity and diverted considerable energy and resources. A PubMed search for the term 'Parvoviridae' reveals an inversion of the previously increasing trend in publications, with a decrease of around 30% in published papers on the subject, a fate shared by numerous other topics in the face of the exponential accumulation of works on SARS-CoV-2. It is foreseeable, and to be hoped, that in the near future research on parvoviruses will resume its pace and progress toward new knowledge and advanced translational applications.

> **Giorgio Gallinella** *Editor*

## *Article* **Completion of the AAV Structural Atlas: Serotype Capsid Structures Reveals Clade-Specific Features**

**Mario Mietzsch <sup>1</sup>, Ariana Jose <sup>1</sup>, Paul Chipman <sup>1</sup>, Nilakshee Bhattacharya <sup>2,†</sup>, Nadia Daneshparvar <sup>2</sup>, Robert McKenna <sup>1</sup> and Mavis Agbandje-McKenna <sup>1,</sup>\***


**Abstract:** The capsid structures of most Adeno-associated virus (AAV) serotypes, already assigned to an antigenic clade, have been previously determined. This study reports the remaining capsid structures of AAV7, AAV11, AAV12, and AAV13 determined by cryo-electron microscopy and three-dimensional image reconstruction to 2.96, 2.86, 2.54, and 2.76 Å resolution, respectively. These structures complete the structural atlas of the AAV serotype capsids. AAV7 represents the first clade D capsid structure; AAV11 and AAV12 are of a currently unassigned clade that would include AAV4; and AAV13 represents the first AAV2-AAV3 hybrid clade C capsid structure. These newly determined capsid structures all exhibit the AAV capsid features including 5-fold channels, 3-fold protrusions, 2-fold depressions, and a nucleotide binding pocket with an ordered nucleotide in genome-containing capsids. However, these structures have viral proteins that display clade-specific loop conformations. This structural characterization completes our three-dimensional library of the current AAV serotypes to provide an atlas of surface loop configurations compatible with capsid assembly and amenable to future vector engineering efforts. Derived vectors could improve gene delivery success with respect to specific tissue targeting, transduction efficiency, antigenicity or receptor retargeting.

**Keywords:** AAV; serotype; capsid; cryo-EM; genome packaging; gene delivery

## **1. Introduction**

Adeno-associated viruses (AAV) are single-stranded DNA packaging viruses of the *Parvoviridae* and belong to the genus *Dependoparvovirus* [1]. Vectors based on AAVs are being developed and used as gene delivery biologics to treat a large variety of monogenetic diseases [2]. Thirteen human and primate AAV serotypes, and numerous genomic isolates have been described and have been assigned to six clades A–F or individual clonal isolates [3]. The virions of the AAVs are composed of non-enveloped capsids with T = 1 icosahedral symmetry and diameters of ≈260 Å [4]. They are assembled from 60 viral proteins (VPs): VP1 (≈82 kDa), VP2 (≈73 kDa), and VP3 (≈61 kDa) in an approximate 1:1:10 ratio [5]. The VPs share a common C-terminus that includes the entirety of VP3. Compared to VP3, VP1 and VP2 are extended at their N-termini with a shared ≈65 amino acid (aa) region and additional ≈137 aa N-terminal to VP2 in the case of VP1 (VP1u). The N-terminal regions of VP1 and VP2 contain conserved elements required for AAV infectivity such as a phospholipase A2 (PLA2) domain, a calcium-binding domain, and nuclear localization signals [6,7]. Overall, the VP1 amino acid sequence identity of the AAV serotypes varies between 57 and 99% [8].
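The stoichiometry quoted above can be turned into approximate copy numbers per capsid. The following is a minimal sketch that assumes the approximate 1:1:10 ratio holds exactly for a single 60-subunit capsid and uses the approximate masses given in the text; the function and variable names are illustrative only.

```python
from fractions import Fraction

# Approximate VP masses (kDa) and the approximate 1:1:10 ratio quoted in the text.
VP_MASS_KDA = {"VP1": 82, "VP2": 73, "VP3": 61}
VP_RATIO = {"VP1": 1, "VP2": 1, "VP3": 10}
SUBUNITS_PER_CAPSID = 60  # T = 1 icosahedral capsid


def vp_copy_numbers(ratio, total):
    """Split `total` subunits according to an integer ratio (idealized case)."""
    parts = sum(ratio.values())
    return {vp: Fraction(total * r, parts) for vp, r in ratio.items()}


copies = vp_copy_numbers(VP_RATIO, SUBUNITS_PER_CAPSID)

# Summed mass of the protein shell only (no packaged genome), in kDa.
protein_mass_kda = sum(copies[vp] * VP_MASS_KDA[vp] for vp in copies)

for vp, n in copies.items():
    print(f"{vp}: {int(n)} copies")
print(f"Approximate protein shell mass: {int(protein_mass_kda)} kDa")
```

Under these idealized assumptions the ratio corresponds to 5 VP1, 5 VP2, and 50 VP3 per capsid. Because the ratio is only approximate, individual particles may deviate from these values.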

**Citation:** Mietzsch, M.; Jose, A.; Chipman, P.; Bhattacharya, N.; Daneshparvar, N.; McKenna, R.; Agbandje-McKenna, M. Completion of the AAV Structural Atlas: Serotype Capsid Structures Reveals Clade-Specific Features. *Viruses* **2021**, *13*, 101. https://doi.org/10.3390/v13010101

Academic Editor: Giorgio Gallinella. Received: 22 December 2020; Accepted: 7 January 2021; Published: 13 January 2021.

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The capsid structures of several natural human and primate AAV serotypes, AAV1–AAV6, AAV8, AAV9, AAVhu.37, AAVrh.8, AAVrh.10, and AAVrh.39, have been determined by X-ray crystallography and/or cryo-electron microscopy (cryo-EM) [9–19]. Regardless of the method of structure determination, only VP3 of the AAVs, except for its first ≈15 aa, is structurally ordered. The VP3 structure consists of an anti-parallel, eight-stranded (βB to βI) β-barrel motif, with the BIDG sheet forming the inner surface of the capsid. An additional strand, βA, runs anti-parallel to the βB strand. Furthermore, all AAVs conserve an α-helix (αA) located between βC and βD. Between the individual β-strands, large loops are inserted that are characterized by high sequence and structure variability among the AAVs. These loops form the exterior surface of the capsid and are named after their flanking β-strands. For example, the HI loop is flanked by the βH and βI strands. The sequence variability of different AAVs results in alternative conformations of these loops, which give rise to AAV serotype-specific capsid surface features. Nine regions of significant diversity at the apex of these loops have been defined as variable regions (VRs) by structural alignments [15]. Despite the structural differences of the VRs, the overall capsid morphology is conserved. The conserved features include cylindrical channels at the icosahedral 5-fold symmetry axes, formed by the DE-loops (VR-II), surrounded by a depression largely outlined by the HI-loops. The 5-fold channel is believed to be the route of genomic DNA packaging and VP1u externalization during endo/lysosomal trafficking following cell entry [20,21]. At the 2-fold symmetry axes, depressions are flanked by protrusions surrounding the 3-fold symmetry axes, and raised capsid regions between the 2- and 5-fold axes are termed 2/5-fold walls.
The 3-fold region as well as the 2/5 fold wall have been identified as receptor binding sites for many AAV serotypes and serve as determinants of cell and tissue tropism. Among the cellular receptors are sialic acids [22–24], heparan sulfate proteoglycans (HSPG) [25–29], terminal galactose [30,31], sulfated N-acetyl-lactosamine [32], AAVR [33], laminin [34], αvβ1 integrin [35], αvβ5 integrin [36], the hepatocyte growth factor receptor [37], the fibroblast growth factor receptor [38], and platelet-derived growth factor receptor [39]. In addition to receptor binding, the surface of the capsid, including the 5-fold region, displays antigenic sites for antibodies raised by the host immune response [40].

In this study, the structures of the AAV7, AAV11, AAV12, and AAV13 capsids were determined by cryo-EM in an effort to complete the panel of available structures for the defined AAV serotypes. The empty and genome-containing capsid structures of these four AAV serotypes were reconstructed to resolutions between 2.54 and 3.15 Å. All density maps displayed well-defined amino acid side chain densities and showed the characteristic AAV capsid features, including the channels at the 5-fold axes, depressions at the 2-fold and surrounding the 5-fold axes, and protrusions that surround the 3-fold axes. The comparison of the empty (no DNA) and full (genome packaged) capsid structures showed no structural differences of the VP monomer except for an ordered nucleotide at the previously described nucleotide (nt) binding pocket in the case of the full capsids and alternative side chain orientations [17]. Compared to AAV2, significant structural differences were observed primarily at the 3-fold protrusions and the 2/5-fold wall due to aa insertions or deletions as well as sequence differences. This characterization of the structures of AAV7, AAV11, AAV12, and AAV13 completes the library for the defined serotypes. It provides a means to functionally annotate their capsids and a visual platform to aid recombinant DNA vector engineering for improved gene delivery applications.

## **2. Materials and Methods**

## *2.1. AAV Production and Purification*

The AAV7, AAV11, AAV12, and AAV13 producer plasmid *cap* genes were synthesized by GeneArt (Thermo Fisher, Waltham, MA, USA) and subcloned into a plasmid with the AAV2 *rep* gene to generate pR2V7, pR2V11, pR2V12, and pR2V13, respectively. Recombinant AAV7, AAV11, AAV12, and AAV13 vectors, with a packaged luciferase gene, were produced by the triple transfection of HEK293 cells, utilizing pTR-UF3-Luciferase, pHelper (Stratagene, San Diego, CA, USA), and either pR2V7, pR2V11, pR2V12, or pR2V13, and harvested 72 h post transfection as previously described [41]. The cleared lysates containing AAV7, AAV12, and AAV13 capsids were purified by AVB Sepharose and AAV11 by POROS Capture Select AAVX affinity chromatography as previously described [42]. Sample purity and capsid integrity were monitored by SDS-PAGE and negative-stain electron microscopy using a Spirit microscope (FEI, Hillsboro, OR, USA).

## *2.2. Cryo-Electron Microscopy Data Collection and 3D Image Reconstruction*

For each of the purified AAV capsids, 3.5 μL was applied to a glow-discharged Quantifoil copper grid with 2 nm continuous carbon support over holes (Quantifoil R 2/4 400 mesh), blotted, and vitrified using a Vitrobot Mark IV (FEI, Hillsboro, OR, USA) at 95% humidity and 4 °C. The capsid distribution and ice quality of the grids were screened in-house using an FEI Tecnai G2 F20-TWIN microscope (FEI) operated under low-dose conditions (200 kV, ≈20 e<sup>−</sup>/Å<sup>2</sup>). Images were collected on a Gatan UltraScan 4000 CCD camera (Gatan, Pleasanton, CA, USA). Grids deemed suitable for high-resolution data collection were used for collecting micrograph movie frames using the Leginon application [43] on a Titan Krios electron microscope. The microscope was operated at 300 kV and data were collected on a Gatan K3 direct electron detector. During data collection, a total dose of ≈60 e<sup>−</sup>/Å<sup>2</sup> was utilized for 45 to 71 movie frames per micrograph (Table 1). The movie frames were aligned using MotionCor2 with dose weighting [44]. All datasets were collected as part of the NIH "Southeastern Center for Microscopy of MacroMolecular Machines (SECM4)" project. For the three-dimensional image reconstruction, the cisTEM software package was utilized [45] and the data were processed as described previously [46]. The sharpened density maps were inspected using Coot and Chimera [47,48]. The −90 Å<sup>2</sup>/0 Å<sup>2</sup> sharpened maps were utilized for assignment of the amino acid main and side chains. The resolutions of the cryo-reconstructed density maps for empty (no DNA) and genome-containing AAV7, AAV11, AAV12, and AAV13 capsids were estimated based on a Fourier shell correlation (FSC) threshold of 0.143 (Table 1).
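To illustrate how a resolution value is read off at the FSC = 0.143 threshold, the sketch below interpolates the crossing point on an FSC curve. It is not the procedure used by the reconstruction software; the function name and the example curve are hypothetical, and the curve values are made up for illustration rather than taken from this study.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (Å) at which an FSC curve first drops below `threshold`.

    `freqs` are spatial frequencies in 1/Å in ascending order and `fsc` are the
    matching correlation values. The crossing is located by linear interpolation
    between the two shells that bracket the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Fractional position of the crossing between shell i-1 and shell i.
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None  # the curve never crosses the threshold


# Made-up FSC curve for illustration only (not data from this study).
example_freqs = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45]
example_fsc = [0.99, 0.95, 0.60, 0.243, 0.043, 0.01]

example_resolution = resolution_at_threshold(example_freqs, example_fsc)
print(f"Estimated resolution: {example_resolution:.2f} Å")
```

In this toy curve the threshold is crossed halfway between the shells at 0.35 and 0.40 Å⁻¹, so the reported resolution is the reciprocal of 0.375 Å⁻¹.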

## *2.3. Model Building and Structure Refinement*

Three-dimensional (3D) homology models of AAV7, AAV11, AAV12, and AAV13 VP3 were generated with the protein structure homology-modeling server SWISS-MODEL (https://swissmodel.expasy.org) [49] using their amino acid sequences (NCBI accession numbers YP\_077178, AAT46339, ABI16639 and ABZ10812, respectively) and supplying the VP3 structures of AAV8 (PDB accession number: 2QA0) for AAV7, AAV4 (2G8G) for AAV11 and AAV12, and AAV3 (3KIC) for AAV13 as templates [9,14,15]. A T = 1 60-mer capsid coordinate model was generated from the respective VP3 with the VIPERdb2 Oligomer generator subroutine by icosahedral matrix multiplication [50]. The 60-mer capsid models of each AAV were docked into their cryo-reconstructed density maps by rigid body rotations and translations using the 'fit in map' subroutine within UCSF-Chimera [48]. This application uses a correlation coefficient (CC) calculation to assess the quality of the fit between the map generated from the model and the reconstructed map. During the model fitting, the voxel (pixel) size of each reconstructed map was adjusted to optimize the CC between the models and maps. The fitted models were exported relative to the respective map for further use. Each map was resized to the voxel size determined in Chimera using the "e2proc3d.py" subroutine in EMAN2 [51] and then converted to the CCP4 format using the program MAPMAN [52]. A VP monomer was extracted from each 60-mer and the side and main chains were adjusted into the maps by manual building and the real-space refinement subroutine in Coot [47]. The adjusted capsid model was refined against the map utilizing the rigid body, real space, and B-factor refinement subroutines in Phenix [53]. Capsid model refinement was alternated with visualization and adjustment of VP side and main chains using Coot while maintaining model geometry as well as rotamer and Ramachandran constraints [47]. The CC and refinement statistics, including root mean square deviations (RMSD) from ideal bond lengths and angles (Table 1), were analyzed using Phenix [53].
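The idea of building a 60-mer by icosahedral matrix multiplication can be sketched as follows. This is not the VIPERdb implementation: the orientation of the symmetry axes is an arbitrary choice made here, the 60 rotations are obtained by closing a 5-fold and a 3-fold generator under multiplication, and the atom position is hypothetical.

```python
import math


def axis_rotation(axis, angle):
    """Rotation matrix (Rodrigues formula) about `axis` by `angle` radians."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    c, s = math.cos(angle), math.sin(angle)
    k = 1.0 - c
    return (
        (c + x * x * k, x * y * k - z * s, x * z * k + y * s),
        (y * x * k + z * s, c + y * y * k, y * z * k - x * s),
        (z * x * k - y * s, z * y * k + x * s, c + z * z * k),
    )


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3))
        for i in range(3)
    )


def matrix_key(m):
    """Rounded, hashable form of a matrix, used to detect duplicates."""
    return tuple(round(v, 6) for row in m for v in row)


def icosahedral_rotations():
    """Generate the 60 proper rotations of an icosahedron (assumed orientation)."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    generators = [
        axis_rotation((0.0, 1.0, phi), 2.0 * math.pi / 5.0),  # 5-fold axis
        axis_rotation((1.0, 1.0, 1.0), 2.0 * math.pi / 3.0),  # 3-fold axis
    ]
    identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    found = {matrix_key(identity): identity}
    frontier = [identity]
    while frontier:  # breadth-first closure under the two generators
        new_frontier = []
        for m in frontier:
            for g in generators:
                product = matmul(g, m)
                key = matrix_key(product)
                if key not in found:
                    found[key] = product
                    new_frontier.append(product)
        frontier = new_frontier
    return list(found.values())


def apply_rotation(m, point):
    """Rotate a 3D point by matrix `m`."""
    return tuple(sum(m[i][j] * point[j] for j in range(3)) for i in range(3))


rotations = icosahedral_rotations()
# Hypothetical atom position (Å) in the reference VP, relative to the capsid center.
reference_atom = (30.0, 50.0, 100.0)
copies = [apply_rotation(r, reference_atom) for r in rotations]
print(len(rotations), "rotations give", len(copies), "symmetry-related positions")
```

Applying every matrix to all atoms of one VP monomer, rather than to a single point, would yield the coordinates of the full 60-mer in this orientation.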


**Table 1.** Summary of data collection, image processing, and refinement statistics.

## *2.4. AAV Capsid Structure Comparison*

The Cα positions of the ordered amino acids within the VP3 atomic coordinates for each of the AAVs were superposed using secondary structure matching (SSM) in Coot [54]. This SSM subroutine also generates a list of the Cα–Cα distances between the aligned structures, which was used to calculate the overall root mean square deviation (RMSD). Deviations between non-overlapping Cα positions, because of residue deletions or insertions, were measured using the distance tool in Coot. Structural identity was determined using PDBeFold (https://www.ebi.ac.uk/msd-srv/ssm/) and calculated as the number of aligned residues (<1.0 Å apart) divided by the total number of residues. Amino acid sequence alignments of the different AAV serotypes were done utilizing the sequence alignment option in VectorNTI (Invitrogen, Carlsbad, CA, USA).
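The two comparison metrics described above can be written down compactly. The sketch below assumes a list of Cα–Cα distances for the superposed residue pairs is already available; the function names and the example values are hypothetical and do not come from the structures in this study.

```python
import math


def overall_rmsd(distances):
    """Root mean square deviation (Å) over the aligned Cα–Cα distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def structural_identity(distances, total_residues, cutoff=1.0):
    """Percentage of residues whose aligned Cα atoms lie closer than `cutoff` Å.

    Follows the definition in the text: aligned residues (<1.0 Å apart)
    divided by the total number of residues.
    """
    aligned = sum(1 for d in distances if d < cutoff)
    return 100.0 * aligned / total_residues


# Hypothetical Cα–Cα distances (Å) for eight superposed residue pairs.
example_distances = [0.2, 0.3, 0.5, 0.4, 2.0, 3.0, 0.6, 0.1]
# Hypothetical total of ten residues: two have no partner because of an insertion.
example_total_residues = 10

print(f"RMSD: {overall_rmsd(example_distances):.2f} Å")
print(f"Structural identity: "
      f"{structural_identity(example_distances, example_total_residues):.0f}%")
```

The two numbers respond differently to local changes: a few large loop deviations raise the RMSD noticeably, whereas the structural identity only counts how many residues fall outside the 1.0 Å cutoff.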

## *2.5. Structure Accession Numbers*

The full and empty AAV7, AAV11, AAV12, and AAV13 cryo-EM reconstructed density maps and the models built for their capsids were deposited in the Electron Microscopy Data Bank (EMDB) and the Protein Data Bank (PDB), respectively, with the following accession numbers: EMD-23190/PDB ID 7L5U (AAV7 full), EMD-23189/PDB ID 7L5Q (AAV7 empty), EMD-23202/PDB ID 7L6E (AAV11 full), EMD-23203/PDB ID 7L6F (AAV11 empty), EMD-23200/PDB ID 7L6A (AAV12 full), EMD-23201/PDB ID 7L6B (AAV12 empty), EMD-23204/PDB ID 7L6H (AAV13 full), and EMD-23205/PDB ID 7L6I (AAV13 empty).

## **3. Results and Discussion**

## *3.1. The Structures of AAV7, AAV11, AAV12, and AAV13 Capsids Complete the Serotype List*

The capsid structures of AAV serotypes 1–6 and 8–9 have been previously reported [9–15], leaving those of AAV7 and AAV10–13 yet to be determined. AAV10, a member of clade E [3], possesses just a single amino acid (aa) difference (A589T) within VP3 compared to AAVrh.39, for which the capsid structure has been determined [17]. Thus, the AAV10 capsid structure is likely identical to AAVrh.39, especially since the AAVrh.10 capsid, which has several aa differences, has already been shown to be structurally identical to AAVrh.39 [17]. In contrast, AAV7 (82 aa differences in VP3 vs. AAV8), AAV11 (109 aa vs. AAV4), AAV12 (109 aa vs. AAV4), and AAV13 (28 aa vs. AAV3) are substantially different from their closest sequence-related AAV serotype, as shown in the parentheses. Thus, to determine their capsid structures, recombinant AAV7, AAV11, AAV12, and AAV13 vectors were produced by triple transfection of HEK293 cells followed by purification with AVB affinity chromatography in the case of AAV7, AAV12, and AAV13, and AAVX affinity chromatography in the case of AAV11, as described in the methods. While the affinity purification resulted in highly pure AAV capsid preparations, it did not separate empty (no genome) and genome-containing (full) capsids, and thus, both types of capsids were observed in cryo-EM micrographs (Figure 1A).

**Figure 1.** Cryo-electron microscopy (cryo-EM) reconstruction of genome-containing (full) and empty AAV7 and AAV11–AAV13 capsids. (**A**) Cryo-electron micrographs showing the presence of full capsids (dark appearance) and empty capsids (light appearance). Scale bar: 50 nm. (**B**) Cross-sectional views of the reconstructed maps determined by cryo-EM reconstruction from full and empty capsids contoured at a sigma (σ) threshold level of 0.9. The reconstructed maps are radially colored (blue to red) according to radial distance to the particle center. This figure was generated using UCSF-Chimera [48].

The distribution of the capsids in the micrographs enabled the independent structural determination of both empty and full capsids using 2D classification, as described previously [46], for each serotype. For AAV7, AAV11, AAV12, and AAV13, the empty/full structures were determined from 40,988/4695, 118,351/10,429, 220,137/40,764, and 56,962/6794 capsids, respectively, to 2.96/3.16, 2.86/3.15, 2.54/2.67, and 2.76/3.00 Å resolution (FSC 0.143), respectively (Table 1). For each of the AAV serotypes, the resolution of the full structures is slightly lower compared to the empty, which is most likely due to fewer capsids used in the reconstructions of the former. Direct comparison of the reconstructed empty and full maps for each AAV serotype in a cross-sectional view clearly showed the electron-dense filled interior of the genome-containing capsids, which is absent from the empty capsids (Figure 1B). Similar to previous observations of full AAV capsid density maps, the majority of the capsid interior is filled except for the region directly underneath the 5-fold channel [17,46]. It has been suggested that the dynamic and flexible VP1/VP2 common region and VP1u could be located in the area under the 5-fold channel in readiness to be externalized through the 5-fold channel, which is a structural rearrangement that is required for its PLA2 enzyme function during the viral life cycle [20].
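The relationship between particle numbers and resolution noted above can be tabulated directly from the values quoted in the text. The sketch below only restates those numbers; the dictionary layout and function name are illustrative, and the computed fraction refers to the particles used for the reconstructions, not necessarily to the composition of the preparations.

```python
# (empty particles, full particles, empty resolution in Å, full resolution in Å),
# as quoted in the text for each serotype.
DATASETS = {
    "AAV7": (40988, 4695, 2.96, 3.16),
    "AAV11": (118351, 10429, 2.86, 3.15),
    "AAV12": (220137, 40764, 2.54, 2.67),
    "AAV13": (56962, 6794, 2.76, 3.00),
}


def full_fraction(empty, full):
    """Fraction of full capsids among the particles used for reconstruction."""
    return full / (empty + full)


for name, (empty, full, res_empty, res_full) in DATASETS.items():
    print(
        f"{name}: {100 * full_fraction(empty, full):.1f}% full particles, "
        f"full map lower in resolution by {res_full - res_empty:.2f} Å"
    )
```

For every serotype the full capsids are the minority class and yield the lower-resolution map, which is consistent with the explanation given above.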

## *3.2. The AAV7, AAV11, AAV12, and AAV13 Capsid Structures Conserve the AAV Features*

Regardless of whether full or empty maps were analyzed, the different AAV serotypes displayed the characteristic morphological features of other AAVs, e.g., a channel at the 5-fold symmetry axes, trimeric protrusions that surround each 3-fold symmetry axis, and a depression at each 2-fold symmetry axis (Figure 2A). However, the exact morphology of the 3-fold protrusions varies between the different AAV serotypes, with much broader protrusions for AAV11 and AAV12 compared to AAV7 and AAV13. Similarly, the shape and orientation of the 2-fold depressions of AAV11 and AAV12 differ from those of AAV7 and AAV13.

**Figure 2.** The AAV7 and AAV11–13 capsid structures. (**A**) The capsid surface density maps contoured at a sigma (σ) threshold level of 2.0. The maps are radially colored (blue to red) according to distance to the capsid center, as indicated by the scale bar on the right. The icosahedral 2-, 3-, and 5-fold axes as well as the 2/5-fold wall are indicated on the AAV7 capsid map. (**B**) The AAV7 and AAV11–13 amino acids modeled for the βG strand are shown inside their respective density maps (black mesh). The amino acid residues are as labeled and shown in stick representation and colored according to atom type: C = yellow, O = red, N = blue, S = green. This figure was generated using UCSF-Chimera [48].

The reconstructed maps of the four AAV serotypes, empty and full, showed well-ordered amino acid side-chain densities (Figure 2B) throughout the VP structure starting at aa position 218–220 (AAV7 numbering), which is comparable to the other currently determined AAV serotype capsid structures [9–15]. The only exception was the apex of surface loop VR-IV in AAV7, where aa 455–458 (GGTAG) were disordered, preventing the reliable placement of main- and side-chain atoms. A similar disorder was previously observed in AAVrh.10 and AAVrh.39, which share the same or a very similar sequence at the apex of the loop, GGTAG and GGTQG, respectively [17]. The glycines on both sides of the apex likely confer flexibility on this loop and are thus the probable cause of the lack of structural order. AAV11–13 do not possess this accumulation of glycines at this loop; thus, their loops were structurally ordered.

## *3.3. The Full Capsids of AAV7, AAV11, AAV12, and AAV13 Show Ordered Nucleotides*

Similar to previous observations, a structural comparison of the empty and full capsids for the individual AAV serotypes showed them to be largely identical, with overall Cα RMSDs ranging from ≈0.2 to 0.3 Å [17,46]. However, a major difference is the observation of weakly ordered density in the interior of the capsid maps in the full structures, interpreted as the packaged genome (Figure 1B). This density extends into a pocket underneath the 3-fold symmetry axis and has been interpreted as deoxyadenosine monophosphate (dAMP), which is positioned between conserved prolines 421/632 and histidine 631 (AAV7 numbering) (Figure 3A). We hypothesized that the genome interacts with this 3-fold region of the capsid interior by binding within the pocket to two symmetry-related VP monomers [17]. Due to the icosahedral symmetry imposed during 3D image reconstruction and the fact that the genome cannot follow this symmetry, other nucleotides leading in and out of this pocket are weakly ordered and cannot be reliably modeled. We postulate that as reconstruction methods improve, relaxation of the enforced icosahedral symmetry in future structure determination efforts may allow the observation of a more ordered DNA structure.

**Figure 3.** Empty and full Adeno-associated virus (AAV) density map differences. (**A**) The modeled AAV7 and AAV11–13 residues at the nucleotide binding pocket with their respective mesh density maps (black = empty, red = full). The extra density exclusively in the full maps was interpreted as an ordered nucleotide (deoxyadenosine monophosphate, dAMP). (**B**) Dual conformation of histidines (e.g., H230 in AAV7). This histidine adopts alternative side-chain conformations primarily in the absence of packaged DNA with the exception of AAV12. Atom colors: C = yellow, O = red, N = blue, S = green, P = orange. This figure was generated using UCSF-Chimera [48].

While the VP structures of empty and full capsids were largely identical, some alternative side-chain orientations, e.g., histidine 230 (AAV7 numbering), were observed. In the AAV serotype structures determined in this study, the histidine side chain preferred the "right" orientation in full capsids (Figure 3B). However, in AAV12 and AAV13, weak density was also observed toward the "left" orientation. In contrast, both orientations are equally adopted in empty capsids, except for AAV12, where the right orientation appears to be favored (Figure 3B). The dual conformation of H230 was previously observed in empty AAVrh.10 capsids [17]. While the cause of this difference between empty and full capsids is unknown, disordered density at low sigma level in the full maps, likely from the packaged genome, appears to contact the histidine side chain in the "right" orientation and thereby induce this preferred conformation of the side chain. Furthermore, H230 is located near the 5-fold symmetry axis, and the different conformation may be related to the observed differences underneath the 5-fold region in both types of capsids (Figure 1B).

## *3.4. The AAV7, AAV11, AAV12, and AAV13 Capsid Structures Display Diversity in Surface Loop Conformations*

The AAV7 and AAV11–13 VP topologies conserve the core eight-stranded anti-parallel β-barrel (βB-βI), with the additional β-strand A and α-helix A (Figure 4A), as described previously for all other AAV structures [9,11–15,17–19,46,55–59]. When superposed, these core regions are homologous for the AAV serotypes (Figure 4A). Between the β-strands, the loops that form the surface of the capsids provide the structural variability among different AAVs. The nine previously defined VRs, VR-I to VR-IX [15], provide serotype-specific functions.

**Figure 4.** Structural comparison of AAV2, AAV7, and AAV11–13. (**A**) Structural superposition of AAV2 (blue), AAV7 (cyan), AAV11 (burgundy), AAV12 (salmon), and AAV13 (wheat) shown as ribbon diagrams. The positions of β-strands, the N- and C-terminus, the variable regions (VRs), and the icosahedral 2-, 3-, and 5-fold axis are indicated. This figure was generated using PyMol [60]. (**B**) Cα–Cα distance plot (in Å) for the AAV7 and AAV11–13 residues relative to AAV2 of the superposed viral protein (VP) structures. The VRs are indicated and the overall VP Cα-RMSD (root mean square deviation) compared to AAV2 shown. The dashed line marks the Cα–Cα distance variation of 1 Å.

Compared to AAV2, the prototype serotype, some of the AAV7 and AAV11–13 surface loops, such as VR-VIII and the HI-loop, showed only minor structural differences with Cα distances of ≤1 Å (Figure 4B). AAV7 also shows minor structural differences compared to AAV2 in four additional loops (VR-III, VR-V, VR-VI, and VR-IX) and AAV13 in six additional loops (VR-I, VR-III, VR-V, VR-VI, VR-VII, and VR-IX). Structural variability (Cα distances of <3 Å) was also observed for the DE-loop/VR-II at the 5-fold symmetry axis for all analyzed AAV serotypes. The absence of major differences in the 5-fold region, which includes the HI-loop, is likely due to the common functions these loops have to fulfill, such as their role in DNA packaging and VP1u externalization [20,21].

Greater structural variability between AAV7 and AAV2 was seen in VR-I, VR-IV, and VR-VII due to single aa insertions (VR-I and VR-IV) or a single aa deletion (VR-VII). In AAV13, the only significant structural difference is observed in VR-IV, which is slightly shorter due to a single aa deletion compared to AAV2. In contrast, major structural variabilities (vs. AAV2) were seen in VR-I, VR-III, VR-IV, VR-V, VR-VI, VR-VII, and VR-IX for AAV11 and AAV12. Consequently, their overall Cα-RMSD for the entire VP is larger than that of AAV7 and AAV13 (Figure 4B). Most notable is the 5 aa insertion in VR-V for both AAV11 and AAV12, relative to AAV2 but also to AAV7 and AAV13 (Figure 4A). In addition, both serotypes possess a single aa insertion in VR-IV and display a different conformation from the other AAV serotypes, with the apex of the loop positioned over part of VR-V. This subloop of VR-V and the alternative conformation of VR-IV are responsible for the broader appearance of the 3-fold protrusions of the AAV11 and AAV12 capsids described above (Figure 2A). On the side of the 3-fold protrusions, VR-VI and VR-VII also showed significant differences in AAV11 and AAV12, with a 1 aa deletion in VR-VI and major structural variabilities (Figure 4A,B). Similarly, structural differences of VR-I, VR-III, and VR-IX lead to morphological differences at the 2/5-fold wall. VR-I takes a different conformation in AAV11 and AAV12 compared to AAV2 due to a 3 aa deletion. This is partially compensated by VR-III with a 2 aa insertion, resulting in a broader loop without extending the height of the loop (Figure 4A). Finally, VR-IX of AAV11 and AAV12 adopts a different conformation without amino acid insertions or deletions relative to AAV2. This variation is located near the 2-fold symmetry axis, resulting in a slightly wider depression in the AAV11 and AAV12 capsids (Figure 2A). The overall Cα RMSDs of AAV11 and AAV12 for the entire VP, 2.05 and 2.02 Å (vs. AAV2), are greater than the overall Cα RMSD of AAV2 compared to AAV5, the most divergent AAV serotype, which is 1.8 Å [46].

## *3.5. The AAV7, AAV11, AAV12, and AAV13 Capsid Structures Display Clade-Specific Surface Features*

The clades for the AAV serotypes were proposed in 2004 [3] based on more than 100 unique isolates from human and non-human primates that were grouped according to the phylogenetic similarity of their VPs (Figure 5A). AAV serotypes 10–13 were described after this study [28,61,62], and thus, they were originally not grouped into the clades.


**Figure 5.** Relationship of the AAV serotypes. (**A**) Cladogram showing the assignment of the AAV serotypes to their clades as proposed by Gao et al. [3]. (**B**) Amino acid sequence identity of the AAV serotypes given as percentage for VP1 and VP3 or the structural identity as a percentage of aligned amino acids within 1 Å when superposed. High values are colored in dark blue or orange and lower values in lighter shades of each color, respectively. \* For AAV10, the VP structure of AAVrh.39 was utilized, which varies by a single aa from AAV10.

AAV7 belongs to clade D (Figure 5A) and is most closely related to the clade E members AAV8 and AAV10 based on VP1 or VP3 amino acid sequence, with 85 to 88% aa sequence identity (Figure 5B). When AAV7 is superposed onto AAV8, the overall Cα RMSD is 0.75 Å with a structural identity of 93% (Figures 5B and 6A), an RMSD slightly lower than the 0.92 Å obtained for the comparison to AAV2 described above. However, the 94% structural identity of AAV7 compared with AAV2 is slightly higher than for AAV7 vs. AAV8 (Figure 5B). Compared to AAV8, the AAV7 VP showed different surface loop conformations in VR-I (1 aa deletion), VR-IV (1 aa insertion), and VR-VII (1 aa deletion). In fact, these AAV7 loops are unique among all the available AAV serotype capsid structures. VR-I of AAV7 is structurally most similar to AAV1 and AAV6, without deletions or insertions but with amino acid variations resulting in Cα distance variation of up to 3 Å. AAV7's VR-VII is the shortest loop among all AAV serotypes with a 1 aa deletion compared to AAV1–AAV4 and AAV6–AAV13 and a 4 aa deletion compared to AAV5. AAV7 vectors were shown to result in high transduction efficiencies of the CNS and spinal cord after delivery into the cerebrospinal fluid or intravenously [63,64]. This indicates that AAV7 might be able to cross the blood–brain barrier (BBB). However, the proposed residues in AAVrh.10 reported to be responsible for this phenotype [17] are only partially conserved in AAV7, e.g., S269, but not N472, where AAV7 has a threonine. More research is needed to determine if AAV7 has the ability to cross the BBB. AAV7 was shown to bind to several AAV8 antibodies, which are termed ADK8, HL2381, and HL2383 [65,66]. These antibodies utilize AAV8's VR-VIII as their epitope [67,68]. The observed cross-reactivities can be explained by the high structural conservation of VR-VIII between AAV7 and AAV8 (Figure 6A).

**Figure 6.** Structural comparison of AAV7 and AAV11–13 to their closest clade member. (**A**) Left—Structural superposition of AAV7 (cyan) and AAV8 (green) shown as ribbon diagrams. The positions of the N- and C-terminus and the variable regions (VRs) are indicated. This figure was generated using PyMol [60]. Right—Cα–Cα distance plot (in Å) for the AAV7 residues relative to AAV8 when the VP structures are superposed. The VRs are indicated, and the overall AAV7 VP Cα–RMSD compared to AAV8 is shown. The dashed line marks a Cα–Cα distance of 1 Å. (**B**) Structural comparison as in (A) for AAV4 (red), AAV11 (burgundy), and AAV12 (salmon). (**C**) Structural comparison as in (A) for AAV3 (yellow) and AAV13 (wheat).

For AAV11 and AAV12, the most closely related AAV serotype is AAV4 (Figure 5A), with a VP1 or VP3 amino acid sequence identity ranging from 78 to 81% (Figure 5B). However, the sequence identity of AAV11 and AAV12 to each other is slightly higher (84–87%). Surprisingly, the structural identity between AAV11 and AAV12 is 98%, which is only surpassed by AAV1 and AAV6 with 99% sequence and structural identity (Figure 5B). Compared to all other AAV serotypes, the VP3 sequence identity ranges from 51 to 59% (Figure 5B). Consequently, when superposed, the AAV4 VP is structurally more similar to AAV11 and AAV12 (Figure 6B), with a structural identity of 91–92%, than to all other AAV serotypes, for which the structural identities range from 63 to 79% (Figure 5B). In particular, AAV4, AAV11, and AAV12 share the insertion in VR-V and the alternative conformation of VR-IV (Figure 6B). The overall Cα RMSD of AAV11 and AAV12 to AAV4 is 0.68 and 0.70 Å, respectively, with minor loop variations in VR-II, VR-III (1 aa deletion in AAV11 and AAV12), VR-VII, and VR-VIII (Figure 6B). When AAV11 and AAV12 are compared to each other, minor structural differences are observed in VR-II and VR-VII, with an overall Cα RMSD of 0.56 Å. We propose that, rather than being regarded as clonal isolates, these three viruses, AAV4, AAV11, and AAV12 (Figure 5A), should be grouped into a new clade G.

While α2–3 linked sialic acid is described as a receptor for AAV4 [22], the receptor for AAV11 and AAV12 is unknown. For AAV12, HSPG and sialic acids were excluded as receptors [62], and no binding to the available glycans on an array was shown [29]. Residues in VR-V, VR-VI, and VR-VIII of AAV4 were suggested to be involved in sialic acid binding [69]. The amino acids in VR-V and VR-VI are conserved structurally and in residue type in AAV11 and AAV12, unlike those in VR-VIII, which may be the reason why AAV11 and AAV12 do not bind sialic acids. Overall, AAV11 and AAV12 vectors have rarely been used for gene delivery purposes; however, AAV11 was described to possess a tropism for the spleen and smooth muscle [61,70], whereas AAV12 was shown to transduce nasal epithelia efficiently [71].

An interesting difference between AAV4, AAV11, and AAV12 for AAV vector production is the requirement for the assembly activating protein (AAP) for capsid assembly [72,73]. While AAV12 is dependent on the presence of AAP, AAV4 and AAV11 are not. An analysis of residues shared between AAV4 and AAV11 but not with AAV12 revealed a total of 23 aa. Previous studies suggested that interior residues are involved in the AAP function [74,75]. Only four of 23 aa differences are located in the interior of the capsid, A301S, A338T, I619V, and R688H (first aa type = AAV4/11, second aa type = AAV12). Of these aa positions, 619 is likely not the determining factor, since AAV11's isoleucine is conserved in most AAV serotypes. A301S and R688H are located near the 2-fold symmetry axis, which is the previously suggested region for AAP binding [74]. More research is needed to confirm the importance of the residues for capsid assembly.

For AAV13, the most closely related AAV serotype is AAV3 (Figure 5A), with a VP1 or VP3 amino acid sequence identity ranging from 94 to 95%, followed by AAV2 with 88–90% (Figure 5B). Nonetheless, when superposed, AAV13 is structurally slightly more similar to AAV2 than to AAV3 (Cα RMSD: 0.65 Å vs. 0.72 Å and structural identity: 96% vs. 92%) (Figure 4B, Figure 5B, and Figure 6C). In addition to the significant difference in VR-IV caused by a 2 aa deletion in AAV13 relative to AAV3, there are also minor variations in VR-I, VR-II, and VR-V (Figure 6C). Common to all three of these AAV serotypes is their ability to bind to HSPG [25,28,29,76] and to the A20 antibody [77]. For AAV13, a critical aa for HSPG binding is K528, which is not present in AAV2 or AAV3 [28]. This residue is located on the side of the 3-fold protrusion within VR-VI. The mutation of K528 to glutamic acid results in the inability to bind HSPG [28,29]. Interestingly, this residue is in a structurally equivalent position to AAV6-K531, which was reported to be important for HSPG binding of AAV6 [78]. AAV13 vectors have not yet been used for gene delivery purposes. Thus, more research is needed to determine their tropism and transduction efficiency.
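The two superposition metrics quoted throughout this section can be illustrated with a short sketch. It assumes that the two VP structures have already been superposed and that their Cα atoms have been paired residue by residue; the function name, the input layout, and the use of "less than or equal to" for the 1 Å cutoff are illustrative assumptions, not a description of the software used in this study.

```python
import numpy as np


def superposition_metrics(ca_a, ca_b, cutoff=1.0):
    """Return (Cα RMSD in Å, structural identity in %) for paired Cα atoms.

    ca_a, ca_b: arrays of shape (n_residues, 3) holding the coordinates of
    aligned Cα atoms after superposition (hypothetical input layout).
    cutoff: distance in Å below which a residue pair counts as structurally
    identical (1 Å, following the definition in the Figure 5 caption).
    """
    a = np.asarray(ca_a, dtype=float)
    b = np.asarray(ca_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two (n_residues, 3) coordinate arrays")

    distances = np.linalg.norm(a - b, axis=1)        # per-residue Cα–Cα distance
    rmsd = float(np.sqrt(np.mean(distances ** 2)))   # root mean squared deviation
    identity = float(100.0 * np.mean(distances <= cutoff))
    return rmsd, identity


# Toy example with four residue pairs (made-up coordinates, not real data).
reference = np.zeros((4, 3))
shifted = np.array([[0.5, 0.0, 0.0],
                    [0.0, 0.0, 0.0],
                    [2.0, 0.0, 0.0],
                    [0.0, 0.5, 0.0]])
rmsd, identity = superposition_metrics(reference, shifted)
print(f"RMSD = {rmsd:.2f} Å, structural identity = {identity:.0f}%")
```

Because the RMSD is dominated by the largest deviations while the structural identity only counts residues under the cutoff, the two metrics can rank pairs differently, as seen for AAV7 vs. AAV2 and AAV8.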

#### **4. Conclusions**

This study determined the capsid structures of AAV7, AAV11, AAV12, and AAV13, thereby completing the panel of available structures for all currently defined AAV serotypes. While these capsids conserve the AAV capsid features such as the 5-fold channels, protrusions around the 3-fold symmetry axes, depressions at the 2-fold axes, as well as the nucleotide binding pocket, they also display surface loops that are not found in any other AAV serotype structure. These loops separate AAV7 from its most closely related AAV members contained within clade E (such as AAV8, AAV10, or AAVrh.10).

AAV11 and AAV12 share structural similarity to AAV4 with loop conformations that are also not found in other AAV serotypes. Thus, while not defined as such, AAV4, AAV11, and AAV12 might form a separate clade. Lastly, AAV13 with AAV3 as its closest related AAV serotype shares structural similarity to both AAV2 and AAV3. It likely belongs to clade C, which was previously described to contain AAV2–AAV3 hybrid members.

The definition of clades suggests antigenic specificity, with members being cross-reactive [3]. However, recent data show that members of different clades cross-react, and thus the clade definition requires revisiting [66]. The completion of the AAV serotype structural atlas, providing visualization of the conserved and variable regions, shows that the serotypes can also be grouped based on structural morphology. These structures provide a template for engineering the AAV capsids for targeted tissue tropism and the escape of recognition by host antibodies toward improved vector efficacy.

**Author Contributions:** M.M. was responsible for sample production and purification, cryo-reconstruction, structure refinement and analysis, model building and refinement, and manuscript preparation. A.J. was responsible for sample production and purification. P.C. vitrified sample and screened cryo-EM grids. N.B. and N.D. collected cryo-EM data. R.M. supervised the project and contributed to manuscript preparation. M.A.-M. conceived and supervised the project, analyzed all results, and contributed to manuscript preparation. All authors have read and agreed to the published version of the manuscript.

**Funding:** The TF20 cryo-electron microscope was provided by the UF College of Medicine (COM) and Division of Sponsored Programs (DSP). Data collection at Florida State University was made possible by NIH grants S10 OD018142–01 Purchase of a direct electron camera for the Titan-Krios at FSU (PI Taylor), S10 RR025080–01 Purchase of a FEI Titan Krios for 3-D EM (PI Taylor), and U24 GM116788 The Southeastern Consortium for Microscopy of MacroMolecular Machines (PI Taylor). The University of Florida COM and NIH GM082946 provided funds for the research efforts at the University of Florida.

**Acknowledgments:** The authors thank the UF-ICBR Electron microscopy core for access to electron microscopes utilized for cryo-electron micrograph screening.

**Conflicts of Interest:** MAM is a SAB member for Voyager Therapeutics, Inc., and AGTC, has a sponsored research agreement with Voyager Therapeutics and Intima Biosciences, Inc. and is a consultant for Intima Biosciences, Inc. MAM is a co-founder of StrideBio, Inc. This is a biopharmaceutical company with interest in developing AAV vectors for gene delivery application.

## **References**


## *Article* **Characterization of the GBoV1 Capsid and Its Antibody Interactions**

**Jennifer Chun Yu 1, Mario Mietzsch 1, Amriti Singh 1,†, Alberto Jimenez Ybargollin 1, Shweta Kailasan 1,‡, Paul Chipman 1, Nilakshee Bhattacharya 2,§, Julia Fakhiri 3,***-***, Dirk Grimm 3, Amit Kapoor 4, Indrė Kučinskaitė-Kodzė 5, Aurelija Žvirblienė 5, Maria Söderlund-Venermo 6, Robert McKenna 1 and Mavis Agbandje-McKenna 1,\***


**Abstract:** Human bocavirus 1 (HBoV1) has gained attention as a gene delivery vector with its ability to infect polarized human airway epithelia and 5.5 kb genome packaging capacity. Gorilla bocavirus 1 (GBoV1) VP3 shares 86% amino acid sequence identity with HBoV1 but has better transduction efficiency in several human cell types. Here, we report the capsid structure of GBoV1 determined to 2.76 Å resolution using cryo-electron microscopy (cryo-EM) and its interaction with mouse monoclonal antibodies (mAbs) and human sera. GBoV1 shares capsid surface morphologies with other parvoviruses, with a channel at the 5-fold symmetry axis, protrusions surrounding the 3-fold axis and a depression at the 2-fold axis. A 2/5-fold wall separates the 2-fold and 5-fold axes. Compared to HBoV1, differences are localized to the 3-fold protrusions. Consistently, native dot immunoblots and cryo-EM showed cross-reactivity and binding, respectively, by a 5-fold targeted HBoV1 mAb, 15C6. Surprisingly, recognition was observed for one out of three 3-fold targeted mAbs, 12C1, indicating some structural similarity at this region. In addition, GBoV1, tested against 40 human sera, showed similar rates of seropositivity to HBoV1. Immunogenic reactivity against parvoviral vectors is a significant barrier to efficient gene delivery. This study is a step towards optimizing bocaparvovirus vectors with antibody escape properties.

**Keywords:** bocavirus; capsid; parvovirus; cryo-EM; gene therapy; antigenicity

## **1. Introduction**

Gorilla bocavirus 1 (GBoV1) is a member of the genus *Bocaparvovirus* in the family *Parvoviridae*, which contains viruses that package single-stranded DNA (ssDNA) [1]. The family is divided into three subfamilies, including the subfamily *Parvovirinae* whose members infect vertebrate hosts [1]. *Bocaparvovirus* is the largest genus in this subfamily, with 21 classified species that infect a variety of hosts, including cows, rabbits, rodents, humans and non-human primates [2–10]. Bovine parvovirus (BPV) is the first discovered member of the genus, isolated from cattle in 1959 [3]. The first discovered member infecting humans is human bocavirus 1 (HBoV1), isolated in 2005 from nasopharyngeal aspirates of children under 2 years of age with acute respiratory infections [7]. Enteric strains, HBoV2-4, were then described from children with acute gastroenteritis [8,9]. GBoV1 was isolated from gorillas with enteritis and is the first identified non-human primate bocavirus [10].

**Citation:** Yu, J.C.; Mietzsch, M.; Singh, A.; Jimenez Ybargollin, A.; Kailasan, S.; Chipman, P.; Bhattacharya, N.; Fakhiri, J.; Grimm, D.; Kapoor, A.; et al. Characterization of the GBoV1 Capsid and Its Antibody Interactions. *Viruses* **2021**, *13*, 330. https://doi.org/10.3390/v13020330

Academic Editor: Giorgio Gallinella

Received: 15 January 2021 Accepted: 15 February 2021 Published: 20 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The bocaviruses have a ~5.5 kb genome that consists of three open reading frames (ORFs): the *ns* gene on the left end, the *cap* gene on the right end and the *np* gene between *ns* and *cap* [7]. The three ORFs are flanked by two non-identical hairpin structures and are transcribed from a single promoter (p5) to generate a single pre-mRNA that is alternatively spliced [11,12]. The *ns* gene encodes non-structural proteins that are essential for viral DNA replication [11]. The *np* gene encodes the NP1 protein, which was shown to play a role in pre-mRNA processing and subsequent capsid protein expression [11,13]. The structural proteins that form the viral capsid, VP1, VP2 and VP3, are encoded by the *cap* gene [14]. Sixty copies of VP1, VP2 and VP3, in an approximate 1:1:10 ratio, assemble into one T = 1 icosahedral capsid with 2-, 3- and 5-fold symmetry axes [14,15]. The three VPs share the same C-terminus. VP3 is the major capsid protein and is the smallest of the three VPs. VP2 shares a common region with VP1 at the N-terminus, called the VP1/2 common region [16]. The unique region of the minor VP1 protein N-terminus (VP1u) contains a phospholipase A2 (PLA2) activity that was shown to be responsible for endosomal escape during trafficking to the nucleus and is absolutely required for infectivity [17–19]. The VP1u is hypothesized to externalize through a channel located at the 5-fold axis of the capsid to activate its PLA2 activity [20,21]. While VP1 incorporation is essential for viral infectivity, the VP3 protein alone was shown to form intact capsids, termed VP3-only capsids, with similar antigenicity to wild-type capsids [22,23]. It is the capsid that interacts with the host environment and is a determinant of host and cell recognition, host immune response and cell entry [24].

Previously reported capsid structures of HBoV1-4 showed conserved features across the *Parvoviridae* family, as well as features unique to the genus [23,25]. The bocavirus VP monomer has a conserved eight-stranded β-barrel motif (βB to βI), forming the interior of the capsid, with a βA strand that runs antiparallel to βB and an α-helix (αA) located between strands βC and βD of the β-barrel [16]. The loops between the β strands form the surface of the capsid, and these surface loops are labeled based on the flanking β strands; for example, the loop between βD and βE is the DE loop. Within the loops are defined variable regions (VRs), ranging from VR-I to VR-IX. These VRs were previously defined for the bocaviruses based on comparisons to the structure of bovine parvovirus (BPV), the first capsid structure solved from the *Bocaparvovirus* genus [26]. In addition to the conserved β-barrel motif, βA and αA, the bocaviruses were reported to possess a unique α-helix (αB) located near VR-III as well as a basket-like structure beneath the 5-fold channel [23].

Recently, HBoV1 and GBoV1, along with enteric human strains HBoV2-4, were proposed as gene therapy delivery vectors. The interest in HBoV1 stems from its specific tropism for the apical side of polarized human airway epithelia (pHAE), which is optimal for the treatment of cystic fibrosis [27,28]. Gene therapy has been the "gold standard" for the treatment of monogenetic diseases such as cystic fibrosis, but adeno-associated virus serotype 2 (AAV2), a vector used for treatment, was shown to be inefficient at delivery to the lung and has tropism for the basolateral side of pHAE [29–31]. In addition to the favorable tropism of HBoV1, the expanded genome capacity of bocaviruses (5.5 kb), compared to AAV, allows the packaging of the full-length 4.7 kb *CFTR* gene [7,28]. Due to these advantages, various studies have aimed to optimize a recombinant (r)AAV2/HBoV1 pseudotyped vector, an HBoV1 capsid-based vector that packages a transgene with rAAV2 inverted terminal repeats (ITRs), for delivery of the *CFTR* gene. This vector was shown to have tropisms similar to the HBoV1 wild-type virus [28,32,33].

Despite the advantages of HBoV1 as a gene therapy vector, the capsid has high seroprevalence in the human population, a hurdle to therapeutic gene delivery. GBoV1 is an alternative to HBoV1, as it is also capable of infecting the apical side of pHAE efficiently, can package 5.5 kb and is less susceptible to neutralization by pooled intravenous immunoglobulins (IVIgs) [33]. The goal of this study was to characterize the GBoV1 capsid and to better understand its functional regions, including its antigenicity. We report the high-resolution structure of the GBoV1 capsid, determined by cryo-electron microscopy (cryo-EM) and 3D single-particle reconstruction. The GBoV1 VP3 monomer conserves features common to parvoviruses and contains features unique to bocaviruses, for example, α-helix αB. Within the capsid interior is a 5-fold basket-like density, which appears smaller when compared to HBoV1-4. The main differences between the HBoV1 and GBoV1 monomers are localized to VR-I, VR-III and VR-V. Low-resolution structures of the GBoV1 capsid complexed with mouse monoclonal antibodies (mAbs) 15C6 and 12C1, originally generated against HBoV1, show epitopes localized to the 5-fold and 3-fold axes, respectively, in agreement with reports for HBoV1 [22]. However, two other mAbs, targeted at the HBoV1 3-fold axis, did not recognize GBoV1, indicating structural variation at this capsid region between the two viruses. Both viruses showed comparably high seroprevalence in the human sera tested, suggesting a high degree of cross-reactivity and that a conserved capsid region, such as the 5-fold region, forms the epitopes. These observations begin to unravel the antigenic properties of GBoV1 and provide information that could aid the engineering of vectors with reduced antigenic reactivity and, thus, improved therapeutic efficacy.

## **2. Methods**

## *2.1. Virus-Like Particle Production and Purification*

The Bac-to-Bac baculovirus system was used for the expression of HBoV1, HBoV4, GBoV1, AAV2 and AAV5 virus-like particles (VLPs) as described previously for HBoV1 (VP3 only), AAV2 and AAV5 [22,34]. For the generation of GBoV1 expressing VP1, VP2 and VP3 (termed wild-type), the entire *cap* gene (NCBI accession no. NC\_014358.1) was cloned and inserted into the pFastBac plasmid with an ACG start codon for VP1. For the generation of GBoV1 VP3-only VLPs, the VP3-encoding sequence (without the VP1 unique region and the VP1/2 common region) was directly cloned and inserted into the pFastBac plasmid for transposition into the baculovirus expression vector. Following the standard manufacturer's protocol, the baculovirus expression vectors were then used to generate recombinant baculovirus stocks expressing GBoV1 wild-type and VP3-only VLPs [35]. Briefly, *Sf*9 insect cells, maintained in SFM Sf9-900 medium (Thermo Fisher, Waltham, MA, USA) with 10% fetal bovine serum (FBS) and antibiotics, were infected at a multiplicity of infection (MOI) of 5 and harvested 72 h post-infection. The cell pellets were subjected to three freeze-thaw cycles and benzonase (Millipore) treatment, and the VLPs were then purified by a sucrose cushion followed by a sucrose density gradient, as described previously [15]. Purified samples were dialyzed into 1× phosphate-buffered saline (PBS) (2.8 mM KCl, 137 mM NaCl, 10 mM Na2HPO4, 1.8 mM KH2PO4) and concentrated to 0.5–2 mg/mL using Apollo concentrators (Orbital Biosciences, Topsfield, MA, USA). Purity and capsid integrity of the VLPs were confirmed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and negative stain electron microscopy analysis using a Tecnai G2 Spirit electron microscope (FEI, Hillsboro, OR, USA) at 120 kV, respectively, as described previously [22].

## *2.2. Dot Immunoblot Analysis*

In dot immunoblots, HBoV1 VP3-specific mAbs were probed against GBoV1 (VP3 only and WT), HBoV1 (VP3 only), HBoV4 (VP3 only), AAV2 and AAV5 native capsids. To determine the seroprevalence, human sera from healthy donors (Valley Biomedical, Winchester, VA, USA) were utilized in dot immunoblots against GBoV1, HBoV1 (VP3 only), AAV2 and AAV5. H1-H1 polyclonal antibody (rabbit double-immunized with HBoV1 VP3 VLPs) was used against denatured capsids as a positive BoV control [36]. Dot blots were performed on nitrocellulose membrane dipped in 1× PBS. Denatured (incubating capsids at 100 °C for 5 min) and non-denatured VLPs were applied directly to the membrane at approximately 10 ng to 1 μg using a vacuum manifold, letting the sample incubate for 10 min. The membranes were blocked in 6% milk in 1× PBS overnight at 4 °C. For the primary antibody, mAb 15C6 was added at a 1:2000 dilution; mAbs 12C1, 4C2 and 9G12 [22] were added at a 1:1000 dilution; H1-H1 was added at a 1:1000 dilution; and human sera samples were added at a 1:500 dilution. All primary antibodies were diluted in 6% milk in PBS-T (PBS with 0.1% Tween) and incubated on the membrane for 1.5 h at room temperature (RT). Secondary anti-mouse and anti-rabbit antibodies were applied to the membrane at a 1:5000 dilution and anti-human IgG was applied at a 1:50,000 dilution in 6% milk in PBS-T for 1 h at RT. Three 5 min washes were performed before and after incubation with the secondary antibody. Blocking, primary and secondary antibody steps were all performed on a shaker. Finally, luminol substrate was applied to the membrane and incubated in the dark for 1 min, before the membrane was exposed to X-ray radiography film and developed.

## *2.3. Generation of Fab Fragments*

Production of the 15C6, 12C1, 4C2 and 9G12 mAbs from BALB/c mice injected with HBoV1 VLPs and subsequent purification of IgGs were previously described [22]. IgG from 15C6 and 12C1 at a concentration of approximately 1–2 mg/mL was buffer-exchanged into 20 mM sodium phosphate, pH 7.0, 10 mM EDTA for papain cleavage. Papain was added to the IgG samples and incubated for 16–20 h at 37 °C with rotation. The papain-IgG mixture was centrifuged at a low speed (1000× *g*) and the supernatant containing cleaved Fab fragments was loaded onto a protein A column. The Fc portion of the cleaved IgG was captured in the protein A column and the flowthrough containing the desired Fab fragments was collected. This flowthrough was concentrated to ~0.5 mg/mL in an Apollo concentrator with a 9 kDa molecular mass cutoff. SDS-PAGE was performed to confirm the purity of the Fab fragment sample.

## *2.4. Preparation of GBoV1-Fab Complexes and GBoV1 VLPs for Cryo-EM Data Collection*

GBoV1 VLPs were mixed with the 15C6 and 12C1 Fab fragments at a 1:120 to 1:180 (VLP:Fab) ratio to ensure binding-site saturation. The VLP:Fab complexes were incubated on ice for 30 min to 1 h and 3 μL of sample was vitrified onto C-flat holey carbon grids (Protochips, Inc.) using the Vitrobot Mark IV (FEI Co.). For the GBoV1 VLPs alone (wild-type and VP3-only), 3 μL of sample was vitrified as described for the complexes.
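The mixing ratio can be put in terms of Fab excess per potential binding site with a small calculation. This is a sketch under two assumptions that are not stated in the text: the ratio is taken as molar, and each of the 60 VP copies of the T = 1 capsid is counted as one potential epitope.

```python
# Number of VP copies in one T = 1 icosahedral capsid (see Introduction).
VP_COPIES_PER_CAPSID = 60


def fab_per_site(fab_per_vlp, sites_per_vlp=VP_COPIES_PER_CAPSID):
    """Fab molecules available per potential binding site on one capsid.

    Assumes a molar VLP:Fab ratio and one potential epitope per VP copy
    (both are illustrative assumptions).
    """
    return fab_per_vlp / sites_per_vlp


for fab_per_vlp in (120, 180):
    excess = fab_per_site(fab_per_vlp)
    print(f"VLP:Fab = 1:{fab_per_vlp} -> {excess:.1f} Fab per VP copy")
```

Under these assumptions, the range used corresponds to a two- to threefold excess of Fab over VP copies.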

### *2.5. Cryo-EM Data Collection*

For the GBoV1-Fab complexes, micrographs were collected from frozen grids using a Tecnai G2 F20-TWIN transmission electron microscope (FEI) operated at 200 kV under low-dose conditions (20 e−/Å<sup>2</sup>) at a magnification of 82,500× on a 16-megapixel charge-coupled device (CCD) camera with a physical pixel size of 15 μm, resulting in micrographs with a pixel size of 1.82 Å. This microscope and camera were also used for screening grids for ice quality and particle distribution of GBoV1 prior to high-resolution data collection and for collecting a low-resolution data set for comparison to the complex structures. For the GBoV1 VLP-alone high-resolution studies, holey carbon grids with vitrified VLPs were used to collect micrograph movie frames on a Titan Krios electron microscope (FEI Co.) operated at 300 kV with a K3 direct electron detector (DED) using the Leginon application. High-resolution data collection was performed with a total dose of 60 to 67 e−/Å<sup>2</sup> for up to 50 movie frames per micrograph. The movie frames collected on the K3 detector were aligned using MotionCor2 with dose weighting as previously described [37]. Data sets were collected as part of the NIH project "Southeastern Consortium for Microscopy of MacroMolecular Machines" (SECM4).
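The micrograph pixel size quoted for the CCD data follows from dividing the physical detector pixel size by the magnification. A minimal sketch of that arithmetic (the function name is illustrative):

```python
ANGSTROM_PER_MICROMETER = 1.0e4


def specimen_pixel_size(detector_pixel_um, magnification):
    """Pixel size at the specimen level, in Å per pixel.

    detector_pixel_um: physical pixel size of the detector in μm.
    magnification: magnification at the detector plane.
    """
    return detector_pixel_um * ANGSTROM_PER_MICROMETER / magnification


# CCD settings given in the text: 15 μm pixels at 82,500× magnification.
pixel = specimen_pixel_size(15.0, 82_500)
print(f"{pixel:.2f} Å/pixel")  # prints "1.82 Å/pixel"
```

The Nyquist limit of such micrographs is twice the pixel size, which is consistent with using this data set only for low-resolution reconstructions.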

## *2.6. 3D Particle Reconstruction*

The cisTEM software package was used for three-dimensional (3D) image reconstruction of both wild-type and antibody-complex structures [38]. The aligned micrographs were first imported and their microscope-based contrast transfer function (CTF) estimated. Suboptimal-quality micrographs were eliminated. Capsids on the remaining micrographs were automatically selected using a particle radius of 125 Å. The selected capsids were subjected to 2D classification and undesirable classes, such as ice or impurities, were removed from the dataset. Both *ab-initio* 3D reconstruction and automatic refinement were performed under default settings. *Ab-initio* 3D reconstruction generated an initial low-resolution model from 10% of the total boxed particles with imposed icosahedral symmetry, and this model was automatically refined against the entire dataset. Map sharpening of the high-resolution structure used a pre-cut off B-factor value of −90 Å<sup>2</sup> and variable post-cut off B-factor values such as 0, 20 and 40 Å<sup>2</sup>. Using the UCSF-Chimera software, the sharpened density maps were analyzed and the −90 Å<sup>2</sup>/0 Å<sup>2</sup> map was used for further model building and structure refinement. The final resolution of the structures was estimated based on a Fourier shell correlation (FSC) threshold criterion of 0.143 (Table 1).
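The FSC criterion can be illustrated with a generic NumPy sketch. This is not cisTEM's implementation, which also handles masking and other corrections; it only shows how a correlation per Fourier shell is computed from two independent half-maps and how a resolution is read off where the curve first drops below 0.143. The function names and the linear interpolation are illustrative choices.

```python
import numpy as np


def fourier_shell_correlation(half_map_1, half_map_2, voxel_size):
    """Return (spatial frequency in 1/Å, FSC) for two cubic half-maps."""
    m1 = np.asarray(half_map_1, dtype=float)
    m2 = np.asarray(half_map_2, dtype=float)
    if m1.shape != m2.shape or m1.ndim != 3 or len(set(m1.shape)) != 1:
        raise ValueError("expected two cubic 3D maps of equal size")
    n = m1.shape[0]

    f1 = np.fft.fftn(m1)
    f2 = np.fft.fftn(m2)

    # Integer shell index (rounded radius in Fourier voxels) of every voxel.
    k = np.fft.fftfreq(n) * n
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int).ravel()

    # Sum cross terms and power per shell.
    cross = np.bincount(shell, weights=(f1 * np.conj(f2)).real.ravel())
    power1 = np.bincount(shell, weights=(np.abs(f1) ** 2).ravel())
    power2 = np.bincount(shell, weights=(np.abs(f2) ** 2).ravel())

    radii = np.arange(1, n // 2 + 1)  # skip the origin, stop at Nyquist
    fsc = cross[radii] / np.sqrt(power1[radii] * power2[radii])
    spatial_freq = radii / (n * voxel_size)
    return spatial_freq, fsc


def resolution_at_threshold(spatial_freq, fsc, threshold=0.143):
    """Resolution (Å) where the FSC first falls below the threshold."""
    below = np.nonzero(fsc < threshold)[0]
    if below.size == 0:
        return 1.0 / spatial_freq[-1]  # never crosses: Nyquist-limited
    i = below[0]
    if i == 0:
        return 1.0 / spatial_freq[0]
    # Linear interpolation between the last shell above and first shell below.
    f0, f1 = spatial_freq[i - 1], spatial_freq[i]
    c0, c1 = fsc[i - 1], fsc[i]
    f_cross = f0 + (c0 - threshold) * (f1 - f0) / (c0 - c1)
    return 1.0 / f_cross


# Toy check with random data (not a real map): identical half-maps
# correlate perfectly in every shell.
rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 16, 16))
freq, fsc = fourier_shell_correlation(volume, volume, voxel_size=1.0)
print(resolution_at_threshold(freq, fsc))
```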

**Table 1.** Summary of data collection, processing and refinement statistics.


#### *2.7. Model Building and Structure Refinement*

The 3D model for the GBoV1 wild-type VP3 monomer was generated from the protein sequence (NCBI accession ADK34012.1) in the online program SWISS-MODEL using the structure of HBoV1 (Research Collaboratory for Structural Bioinformatics [RCSB] PDB code 5URF) as a template [39]. This reference monomer model was used to generate a 60-mer (based on 60 copies of the VP3 protein) with the VIPERdb2 oligomer generator [40], which was docked as a rigid body into the GBoV1 density map using the "fit in map" subroutine in UCSF-Chimera [41]. The docked VP monomer model was adjusted to better fit the GBoV1 wild-type cryo-reconstructed density map with manual model-building tools and real-space-refine options in Coot [42]. Further refinement was performed on the model with PHENIX, using the real-space-refinement subroutine under default settings for five macrocycles [43]. This refined model was inspected in Coot and amino acid side chains were adjusted, if needed, for favorable statistics. After another round of refinement in PHENIX, an icosahedral model was generated from 60 copies of the refined VP3 monomer with the VIPERdb2 oligomer generator. The 60-mer VP3 model was further refined in PHENIX using B-factor refinement options.

#### *2.8. Antibody Epitope Mapping*

The high-resolution 60-mer GBoV1 structure was rigid body-docked into the antibody-complex density maps, using the "fit in map" subroutine in UCSF-Chimera. A generic Fab (PDB ID: 2FBJ) was fitted into the density of the Fab using the same subroutine in UCSF-Chimera [41]. The resulting pseudo-atomic model was used to generate a roadmap using RIVEM [44]. The contact residues, that is, residues in the interface between the capsid and antibody structure, were identified via manual inspection in the program Coot [42]. Occluded residues were also identified manually by generating a roadmap in RIVEM using the GBoV1 structure and generic Fab.

### *2.9. Sequence and Structural Comparison*

The VP3 models of HBoV1-4 and GBoV1 were compared in Coot using the superposition tool. Overall paired root mean squared deviations (RMSD) were calculated between Cα positions. The distances between Cα positions of regions with insertions or deletions were manually measured in Coot with the distance tool. Regions with two or more adjacent amino acids and a Cα difference greater than 2 Å, as determined in Coot, are considered to be structurally diverse and are assigned to previously described VRs [42].
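The criterion for structurally diverse regions can be written as a simple scan over per-residue Cα–Cα distances. The sketch below assumes the distances have already been obtained from superposed, residue-aligned models (here entered by hand as made-up values); it is an illustration of the rule, not the Coot workflow used in this study, and the function name is hypothetical.

```python
def structurally_diverse_runs(distances, cutoff=2.0, min_length=2):
    """Find runs of adjacent residues whose Cα–Cα distance exceeds the cutoff.

    distances: per-residue Cα–Cα distances in Å, in sequence order.
    Returns a list of (first_index, last_index) pairs (0-based, inclusive)
    for runs of at least `min_length` adjacent residues above `cutoff`.
    """
    runs = []
    start = None
    for i, d in enumerate(distances):
        if d > cutoff:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(distances) - start >= min_length:
        runs.append((start, len(distances) - 1))
    return runs


# Made-up distances: residues 1-2 and 6-8 form runs; residue 4 is isolated.
example = [0.5, 2.5, 3.0, 0.4, 2.2, 0.3, 2.1, 2.6, 2.9]
print(structurally_diverse_runs(example))  # [(1, 2), (6, 8)]
```

Requiring at least two adjacent residues keeps single outliers, for example a flipped side chain that shifts one Cα, from being reported as a variable region.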

## *2.10. Structure Accession Numbers*

The GBoV1 WT cryo-EM reconstructed density map and the model built for the capsid were deposited in the Electron Microscopy Data Bank (EMDB) and the Protein Data Bank (PDB) with accession numbers EMD-23460 and 7LNK, respectively.

## **3. Results & Discussion**

## *3.1. GBoV1 Shares Conserved Capsid Features with the HBoVs*

The GBoV1 VLPs were produced using recombinant baculovirus expressing either GBoV1 VP1, VP2 and VP3 (GBoV1 WT) or only VP3 (GBoV1 VP3-only) in *Sf*9 insect cells. The purified GBoV1 WT sample was analyzed by SDS-PAGE to confirm the presence and purity of VP1, VP2 and VP3, with molecular weights of approximately 80, 65 and 60 kDa, respectively. For GBoV1 VP3-only, only one band is present at 60 kDa (Figure 1a). Cryo-EM micrographs confirmed the presence of intact capsids with a diameter of approximately 250 Å without the presence of contaminants (Figure 1a). Thus, the samples were deemed suitable for data collection for high-resolution structure determination and movie frame micrographs were collected. 3D image reconstruction of 168,565 GBoV1 WT and 218,746 GBoV1 VP3-only capsids resulted in structures with an estimated resolution of 2.76 Å for both types of capsids based on an FSC threshold of 0.143 (Figure 1b–d, Table 1).

The GBoV1 capsid structures were identical, despite being assembled from different VP compositions (Figure 1a). They share the conserved features of the *Parvovirinae* subfamily, with a channel at the 5-fold symmetry axis, protrusions around the 3-fold axis, the 2/5-fold wall located between the depressions at the 2- and 5-fold axes, and depressions at the 2-fold axis as well as around the 5-fold axis (Figure 1b,c) [16]. The basket-like structure beneath the 5-fold channel previously reported within the capsids of HBoV1-HBoV4 and BPV [15,23,25,26] was less pronounced in GBoV1, suggesting less order at the N-terminus of the VP (Figure 1c). This basket contains residues located at the N-terminus of VP3 and is part of a glycine-rich region hypothesized, for parvoviruses, to act as a hinge for the externalization of the VP1u to utilize its PLA2 activity. The location of this density below the 5-fold axes is consistent with the suggested use of this channel for the VP1u externalization [19,20]. Interestingly, this region of GBoV1, residues 1–32 (VP3 numbering), is similar to the analogous regions of HBoV2-4 (aa 1–32), with a sequence identity ranging from 69 to 79%, but shares only a 50% sequence identity with HBoV1 despite the higher sequence identity across the entire VP3 for all five viruses (Table 2).

**Figure 1.** The capsid structure of Gorilla bocavirus 1 (GBoV1). (**A**) Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) of GBoV1 WT and VP3-only samples confirming the presence of VP1, VP2 and VP3 (~80, 65, 60 kDa) and cryo-electron micrograph showing intact viral particles. (**B**) Capsid density map of GBoV1 WT contoured at a sigma (σ) threshold of 1.0. The radial distance from the center measured in Å is colored as shown. Arrows point to the 5-fold, 3-fold or 2-fold symmetry axis and the 2/5-fold wall. (**C**) Cross-sectional view of the GBoV1 WT density map. (**D**) Fourier shell correlation (FSC) plot for the cryo-reconstruction with an estimated resolution of 2.76 Å at an FSC threshold of 0.143. Resolution (Å) is presented using a log scale. (**E**) Atomic model of amino acids 55–61 (βB) represented within their density map contoured at a σ threshold level of 1. C = yellow, O = red and N = blue. Panels B, C and E were made using UCSF-Chimera [41].


**Table 2.** HBoV1-4 and GBoV1 primary VP3 sequence identity (bottom left) and structural identity (top right).

In both GBoV1 structures, all residues except 1–32, i.e., residue 33 to the last C-terminal residue, aa542 (VP3 numbering), were structurally ordered and models could be built into the cryo-reconstructed density maps (e.g., in Figure 1e). The densities of the amino acid side chains were well-defined for the β-strands and most of the surface loops. Some acidic residue side chain densities were less defined. This observation is caused by the high sensitivity of these residue types to radiation damage, as has been reported in other high-resolution cryo-reconstructed maps [45]. The GBoV1 VP3 structure conserved the parvovirus features, including the eight-stranded β-barrel motif, α-helix A and β-strand A (Figure 2a). The structure also featured α-helix B, a region unique to the bocaparvoviruses, and the ten defined VRs, VR-I to VR-VIIIB and VR-IX (Figure 2a). The GBoV1 WT model refinement statistics (Table 1) are consistent with or better than those for structures reported at this resolution by cryo-reconstruction for other bocaparvoviruses as well as other parvoviruses [16,23]. The root mean squared deviation (RMSD) between the VP models built into the GBoV1 WT and VP3 reconstructed density maps is 0.32 Å. Due to this high similarity, only the GBoV1 WT model will be used for further analysis.
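The RMSD quoted above can be illustrated with a minimal sketch. It assumes that two models have already been superposed (the study used Coot for this step) and that their matched Cα coordinates are available as equal-length arrays. The function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from the GBoV1 data.

```python
import numpy as np

def ca_rmsd(coords_a, coords_b):
    """RMSD between two already superposed sets of matched C-alpha
    coordinates, each of shape (N, 3). Units follow the input (e.g. Å)."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (N, 3)")
    # Squared distance for each matched atom pair, then mean and square root.
    sq_dist = np.sum((a - b) ** 2, axis=1)
    return float(np.sqrt(sq_dist.mean()))

# Toy example (hypothetical coordinates): every atom shifted by 0.3 Å along x.
model_a = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
model_b = model_a + np.array([0.3, 0.0, 0.0])
rmsd_value = ca_rmsd(model_a, model_b)
```

Restricting the input arrays to the residues of a single variable region would give a local RMSD of the kind listed per VR, again under the assumption that the superposition has been done beforehand.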

**Figure 2.** Structural comparison of GBoV1 to the HBoVs. (**A**) The VP3 monomer structure of GBoV1 shown as a ribbon diagram, with the secondary structure elements, N- and C-terminus and VRs labeled. The approximate positions of the icosahedral 2-, 3- and 5-fold axes are indicated as filled oval, triangle and pentagon, respectively. (**B**) VP3 monomer structures of HBoV1 (blue) and GBoV1 (yellow) superposed. The labels are as in panel (**A**). (**C**) VP3 monomer structures of HBoV1 (blue), HBoV2 (orange), HBoV3 (green), HBoV4 (red) and GBoV1 (yellow) superposed. The labels are as in panel (**A**). The color for each model is as given beside panel (**C**). Images were superposed in the Coot program [42] and visualized in the PyMol program [46].

## *3.2. Structural Differences between GBoV1 and the HBoVs Are Localized to the Variable Regions*

The GBoV1 VP3 monomer has high primary sequence identity to the HBoVs, with 86.3% for HBoV1 and ~79% for HBoV2-4 (Table 2). The structural identity was determined by superposing the model of GBoV1 onto the previously published models of HBoV1-4 in Coot (Figure 2a–c) [23,42]. A structure-based sequence alignment was generated using these measured Cα distances, revealing that secondary structures (βI-G, βC-F core, αA and αB) are conserved and the surface loops between these regions are characterized by amino acid substitutions, insertions and deletions that result in structural differences (Figures 2 and 3).

**Figure 3.** Structure-based sequence alignment of Human bocavirus 1 (HBoV1)-4 and GBoV1. The structure-based sequence alignment, starting from the first ordered residue (aa33), was generated using distance values from the Coot [42] superpose tool. Secondary structural elements, β-strands and α-helices, are indicated by blue arrows and red cylinders, respectively. Regions highlighted with orange indicate sequence identity between HBoV1-4 and GBoV1. The locations of the VRs are also indicated based on the previously defined VRs [23]. Amino acid number, based on HBoV1, is shown above the sequences. Structural variability, defined by amino acids whose Cα atoms are >2 Å apart, are offset low and highlighted in red.

The highest structural variabilities between the VP3s of the five viruses compared are localized to VR-I, VR-III and VR-V, with RMSDs of up to 3.3 Å, 4.5 Å and 3 Å, respectively (Figure 3, Table 3). VR-I and VR-III, along with VR-VII and VR-IX, form the 2/5-fold wall, a region of the capsid reported to be important for antigenic reactivity and receptor binding in parvoviruses [16]. VR-I has high structural variability as a result of the primary sequence differences at residues 78–85 (Figure 2b,c and Figure 3). For VR-III, GBoV1 and HBoV1 are both structurally divergent from HBoV2-4, with a four amino acid insertion at the apex of the EF loop (Figure 3). Due to the four amino acid insertion in HBoV1 (a respiratory virus) that is not present in HBoV2-4 (gastroenteric viruses) (Figure 3), VR-III was previously suggested as a region that determines tissue tropism [23]. The GBoV1 VR-III contains two amino acid substitutions relative to HBoV1 at residues 205–206, where NA is switched to TT (Figures 2c and 3). With these substitutions, the RMSD at VR-III for GBoV1 to HBoV1 (4.5 Å) is higher than that of HBoV2-4 (3.1–3.3 Å) (Table 3). The high structural variability between GBoV1 and HBoV1-4 suggests VR-III may also play a role in host tropism. In other parvoviruses, the region analogous to VR-III serves as a determinant for tissue tropism, pathogenicity, transduction efficiency and antigenicity [24]. VR-V, along with VR-IV and VR-VIII, forms the protrusions around the 3-fold axis. The 3-fold protrusions have been shown to be part of an antigenic footprint for HBoV1, as well as to be important for both antigenicity and transduction efficiency in parvoviruses [16]. Interestingly, the GBoV1 VR-V is structurally more similar to the VR-V of HBoV2-4 than to that of HBoV1 (Table 3).


**Table 3.** Local root mean squared deviations (RMSDs) in angstroms (Å) for aligned HBoV1-4 and GBoV1 VRs. Higher values are shaded darker.

When comparing the GBoV1 and HBoV1 structures, RMSDs of 2.1 Å and 2.5 Å are observed for VR-II and VR-VIIIB (Figure 2b, Table 3). Differences between GBoV1 and the other viruses range from 0.6 to 2.9 Å (Table 3). VR-II is located at the apex of the DE loop, five of which form the 5-fold channel. The 5-fold channel has been proposed to be important for genome packaging and VP1u externalization [24]. VR-II is highly conserved and is part of a cross-reactive epitope between HBoV1, HBoV3 and HBoV4 [22]. The slight structural difference within the VR-II is attributed to the need for this region to be flexible to allow the proposed externalization through this 5-fold channel. VR-VIIIB or the HI loop, is located on the depressions around the 5-fold channel. While HBoV1 structurally is the most divergent compared to HBoV2-4 and GBoV1, this loop is part of a cross-reactive epitope including HBoV1, HBoV2 and HBoV4 [22]. This suggests that only a few residues within the two loops are important for the antibody recognition of the cross-reactive antibody. This region is also reported as being important for genome packaging and capsid assembly of other parvoviruses [47–49].

For VR-IV, VR-VI, VR-VII, VR-VIII and VR-IX, there is the least structural divergence (RMSD of 0.4–1.1 Å) of GBoV1 compared to HBoV1 (Table 3, Figure 2b). VR-IV forms the protrusions around the icosahedral 3-fold axis [23]. HBoV4 differs from the other four viruses in that it has a two amino acid insertion within this loop, conferring a different conformation to the protrusions around the 3-fold axis compared to the other bocaparvoviruses (Figures 2c and 3). VR-VI, VR-VII and VR-VIII, all located on the side of the 3-fold protrusions, have minor to no amino acid sequence differences (Figure 3). These VRs have been shown to play a role in antigenicity and also parvovirus infectivity [16,24]. Lastly, VR-IX is located at the 2/5-fold wall. This region is structurally identical for HBoV1-4 and GBoV1, despite amino acid sequence variations, and has been implicated as a host tropism determinant for bocaparvoviruses [23]. As an example, BPV has a 7 amino acid deletion in VR-IX compared to the HBoVs [26]. GBoV1's VR-IX is structurally identical to that of HBoV1-4 and GBoV1 was shown to be capable of infecting human cell lines [33], suggesting that this region may govern primate and human cell tropisms.

## *3.3. The GBoV1 Capsid Differs Antigenically to the HBoV1 Capsid*

Antibodies 15C6, 12C1, 4C2 and 9G12 were generated in mice against HBoV1 capsids, using hybridoma technology, in a previous study [22]. These antibodies were tested for reactivity against the GBoV1 capsid using native dot immunoblots (Figure 4). 15C6 and 12C1 were cross-reactive between HBoV1 and GBoV1, whereas 4C2 and 9G12 were specific for HBoV1. Previously, the 15C6 binding footprint was mapped to capsid surface features surrounding the icosahedral 5-fold axis, whereas 12C1, 4C2 and 9G12 were shown to recognize the protrusions surrounding the 3-fold axis [22]. As expected, the reactivity of these HBoV1-specific mAbs was the same for GBoV1 WT and VP3-only capsids (Figure 4). This is consistent with the observation that the epitopes of the two cross-reacting antibodies are located on the capsid surface, which is formed by the VP3 common region. The antibody H1-H1, a polyclonal rabbit antibody generated against HBoV1 VP3 VLPs, served as a positive control, detecting denatured VLPs [36].

**Figure 4.** Cross-reactivity of GBoV1 capsids with HBoV1 antibodies via native dot blot. 10<sup>11</sup>, 10<sup>10</sup> or 10<sup>9</sup> viral capsids were loaded onto a nitrocellulose membrane and tested against H1-H1 (positive control for denatured virus-like particles (VLPs)) and HBoV1 antibodies 15C6, 12C1, 4C2 and 9G12 (detecting conformational epitopes). 10<sup>11</sup> not shown for 15C6 and H1-H1 due to overexposure.

Cryo-EM and 3D image reconstruction were used to determine the structures of the 15C6 and 12C1 Fabs complexed with the GBoV1 WT capsid (Figure 5). A total of 1895 individual capsid complexes were used for the reconstruction of the GBoV1:15C6 and 4108 of the GBoV1:12C1 complexes, with estimated resolutions of 6.4 Å and 6.2 Å, respectively, based on a FSC threshold level of 0.143 (Figure 5a). For direct comparison, a low resolution GBoV1 WT-capsid structure was determined to 5.3 Å from 17,284 capsids (Figure 5a,b,e).

The density maps of the GBoV1:15C6 complex showed density corresponding to the bound 15C6 Fabs surrounding the 5-fold channel (Figure 5c). For the GBoV1:12C1 complex, the 12C1 Fab was bound on the protrusions surrounding the 3-fold axis (Figure 5d). A 0.5 σ threshold density map was used to visualize both the complementarity-determining regions (CDRs) and the constant regions of the Fab (Figure 5d). At a 1 σ threshold, five copies of the Fab are visible for the GBoV1:15C6 complex but only the CDRs for the GBoV1:12C1 complex (not shown). The visible surface features of the GBoV1:15C6 complex, the 3-fold protrusions and the 2/5-fold wall, are consistent with the low-resolution GBoV1 WT structure. The 5-fold channel and the depressions surrounding the channel in the GBoV1:12C1 complex are also consistent with the low-resolution GBoV1 WT structure (Figure 5b). Underneath the 5-fold channel, a basket-like density can be seen in all three structures (Figure 5e–g), consistent with improved ordering at lower resolution as observed in the low- and high-resolution structures of BPV and HBoV1-4 [15,23].

**Figure 5.** Antibody epitopes on GBoV1 capsid localized to 5- and 3-fold axes for 15C6 and 12C1. (**A**) FSC plots for the cryo-reconstruction with an estimated resolution value of 5.3 Å, 6.4 Å and 6.2 Å, at an FSC threshold value of 0.143 for GBoV1, GBoV1:15C6 and GBoV1:12C1, respectively. Resolution (Å) is presented using a log2 scale. (**B**) Capsid density map of GBoV1 contoured at σ threshold of 1.0. (**C**) Capsid density map of GBoV1 complexed with 15C6 (GBoV1:15C6) contoured at σ threshold of 1.0. (**D**) Capsid density map of GBoV1 complexed with 12C1 (GBoV1:12C1) contoured at σ threshold of 0.5. (**E**) Cross-sectional view of the GBoV1 capsid density map. (**F**) Cross-sectional view of the GBoV1:15C6 complex density map. (**G**) Cross-sectional view of the GBoV1:12C1 complex density map.

Rigid-body docking of the refined 60-mer model of GBoV1 WT was performed for the GBoV1:15C6 and GBoV1:12C1 density maps, along with a generic Fab (PDB ID: 2FBJ), with a CC of 0.93 and 0.90, respectively (Figure 6). Steric clashes at the 5-fold axis for the individual 15C6 Fabs likely resulted in local disorder for the Fab (Figure 6). For 12C1, it appears that enough space is available at the 3-fold protrusions (Figure 5), but disorder is also observed. A 2D stereographic projection (roadmap) representation of the complex, generated based on the fitted models, identified the contact and occluded (within Fab footprint but not contacting capsid) residues (Figure 7) [44]. The 15C6 epitope lines the 5-fold channel, encompassing the residues that form VR-II (residues 142-GAD-144) and VR-VIIIB (residues 460-STNA-463), which are the DE and HI loops, respectively (Figure 7a). These residues are conserved between GBoV1 and HBoV1, with the exception of residue S460, which is A460 in HBoV1. The high conservation within this sequence region as well as high structural similarity explains the cross-reactivity between the GBoV1 and HBoV1 capsid for the 15C6 antibody. This antibody also cross-reacts with HBoV2 and HBoV4 [22]. The 12C1 epitope sits on the protrusions around the 3-fold axis and contains contact residues from VR-I (80-SNGN-83), VR-IV (276-IRQNGQTTA-284) and VR-VIII (390-NQTT-393) (Figure 7b). Compared to HBoV1, the GBoV1 VR-IV and VR-VIII are structurally identical despite having amino acid differences (Figure 3). This structural identity likely dictates the cross-reactivity of 12C1 for the GBoV1 capsid. Interestingly, antibodies 4C2 and 9G12, which also recognize the 3-fold protrusions, were not cross-reactive despite the 94.3% structural identity that is shared between the GBoV1 and HBoV1 VP3 monomers [22]. VR-I, VR-III and VR-V are the most structurally divergent loops between HBoV1 and GBoV1 and contain residues outside the 15C6 and 12C1 epitopes (Figure 7c). The residues outside these epitopes, particularly at the 3-fold protrusion, are potentially responsible for this difference in antigenic reactivity with respect to 4C2 and 9G12.

**Figure 6.** GBoV1-Fab binding interfaces. (**A**) Close-up view of the GBoV1 WT structure docked to a generic Fab (PDB ID 2FBJ) within the cryo-reconstructed density of GBoV1-15C6 (represented as a gray mesh, contoured at 0.5σ) and (**B**) GBoV1-12C1. Highlighted VRs are colored as shown in key. Generic Fab (dark brown) consists of a heavy and light chain, each with constant and variable regions. The Fab variable region interacts with the surface of the capsid. The GBoV1 capsid is also colored in tan.

**Figure 7.** The GBoV1 15C6 and 12C1 epitopes. (**A**) Roadmap surface representation of the GBoV1 15C6 epitope. Colored in orange are the modeled contact residues between the GBoV1 capsid and the 15C6 Fab model. Colored in yellow are the residues occluded by the bound 15C6 Fab. (**B**) Roadmap surface representation of the GBoV1 12C1 epitope. Colored in blue are the modeled contact residues between the GBoV1 capsid and the 12C1 Fab model. Colored in cyan are the residues occluded by the 12C1 Fab. (**C**) Position of VR-I, VR-II, VR-III and VR-V, VR-VIIIB on the GBoV1 capsid. Amino acid residues that are exposed on the capsid surface are labeled with their 3-letter code and residue number. The 5-fold, 3-fold and 2-fold axis are indicated by a filled pentagon, triangle and ellipse, respectively. The roadmaps were generated with the RIVEM program [44].

## *3.4. HBoV1 and GBoV1 Share Similar Rates of Seropositivity*

Forty human serum samples from adult donors were screened by native dot immunoblot against HBoV1, GBoV1, AAV2 and AAV5 capsids. Approximately 98.3% of the samples reacted to HBoV1 capsids, 88.3% to GBoV1, 25% to AAV2 and 14.2% to AAV5 (Figure 8). This suggests that HBoV infections are prevalent in North America, the source of the analyzed human sera. This seroprevalence is comparable to results from a previous study analyzing sera from adults in Finland (95%) and Pakistan (99%) [50]. Nevertheless, seroprevalences are obtained without consideration of HBoV2-4 cross-reacting IgGs, so the true rates of HBoV1-specific and GBoV1-specific seropositivity will be skewed by potential HBoV2-4 IgGs. The high GBoV1 response is thus likely caused by anti-HBoV antibodies. In addition, generally weaker signal intensities were observed for GBoV1 compared to HBoV1 across the forty samples (Figure 8). The minor difference of 10% in seropositivity indicates variation in antigenic reactivity. Interestingly, AAV2 has been reported to have a 72% seroprevalence (59% neutralizing) in French adults, a significant difference compared to the data here, suggesting that location may be a large variable in these epidemiological studies [51]. As an example, another study reported 25–30% neutralizing antibodies against AAV2 when 100 human serum samples from North American adults were tested [52]. It is important to note that native dot immunoblots do not report the neutralization potential of the antibodies from the forty samples but rather the antibodies capable of recognizing the capsid surface. Further study of neutralizing factors and cross-reactivity within the human sera will be needed to determine the effect of such samples on the transduction efficiency in human cells and tissue.

**Figure 8.** Dot immunoblot analysis of HBoV1 and GBoV1 against human sera. (**A**) Representative native dot immunoblots of HBoV1 and GBoV1 against human sera with 10<sup>10</sup> or 10<sup>9</sup> loaded capsid particles. AAV2 and AAV5 are used as controls. Samples tested are as labeled. (**B**) Bar graph representation of the percentage of positive signal based on visual inspection of the 40 dot immunoblot reactivities. *n* = 3.

#### **4. Conclusions**

This study reports the first high-resolution structure of the GBoV1 WT capsid, resolved to 2.76 Å resolution. Compared to other members of the genus, the GBoV1 capsid shares similar surface features, such as the channel at the 5-fold symmetry axis, protrusions around the 3-fold axis, the 2/5-fold wall, located between the depressions at the 2- and 5-fold axes, and depressions at the 2-fold as well as around the 5-fold axes. In addition to the high-resolution WT structure, the structures of two capsid-antibody complexes are reported, highlighting antigenic epitopes on the GBoV1 capsid surface. GBoV1 and HBoV1 share a high sequence and structural identity, with major structural differences localized to VR-I, VR-III and VR-V. VR-I and VR-III are both part of the 2/5-fold wall of the capsid and VR-V is located on the protrusions around the icosahedral 3-fold axis. These VRs contain residues that are within the epitopes of HBoV1 cross-reactive monoclonal antibodies 12C1, 4C2 and 9G12. All three antibodies share the same antibody epitope, as previously reported [27]. Interestingly, native dot immunoblots show that the GBoV1 capsid is capable of escaping 4C2 and 9G12 but not 12C1, suggesting that minor structural differences at these VRs are responsible for the GBoV1 capsid's ability to escape antibodies 4C2 and 9G12. Overall, the GBoV1 and HBoV1 capsids are antigenically similar at the icosahedral 5-fold axis, a region that is most conserved amongst parvoviruses, yet differ at the 3-fold axis. The reported capsid structures and epitopes can guide strategies for vector engineering and aid the development of the GBoV1 capsid as a viral vector. In addition, the HBoV1 and GBoV1 capsid seropositivity rates against human sera point to high cross-reactivity between the two viruses. Potential binding sites are likely the 5-fold region that is highly conserved. This, however, remains to be determined. 
Interestingly, the HBoV1 and GBoV1 seropositivity rates with human sera were significantly higher than those for AAV2 and AAV5 in the North American adult samples tested. This observation further emphasizes the need to understand the antigenic reactivity of the bocaparvoviruses if they are to be developed as vectors for clinical gene delivery.

**Author Contributions:** J.C.Y. was responsible for virus production and purification, native dot immunoblot analysis, cryo-reconstruction, structure refinement and analysis, model building and refinement and manuscript preparation. M.M. contributed to cryo-reconstruction, structure building, model building and refinement analysis and manuscript preparation. A.S. was responsible for aiding in virus production and purification. A.J.Y. contributed to the dot immunoblot analysis for the human sera. S.K. produced and purified HBoV4. J.F., D.G. and A.K. were responsible for plasmid design. I.K.-K. and A.Ž. developed the HBoV mAbs. M.S.-V. provided the H1-H1 antibody and contributed to interpretation of the results and manuscript preparation. P.C. vitrified sample and screened cryo-EM grids. N.B. collected cryo-EM data. R.M. and M.A.-M. conceived and designed this project, analyzed all results and contributed to manuscript preparation. All authors have read and agreed to the published version of this manuscript.

**Funding:** This project was possible due to the National Health Institute grant NIH R01 GM02946, the Sigrid Jusélius Foundation and the Life and Health Medical Foundation and the Cystic Fibrosis Foundation (CFF) grant GRIMM15XX0.

**Acknowledgments:** The authors thank the University of Florida (UF) Interdisciplinary Center for Biotechnology Research (ICBR) EM lab for providing negative-stain EM and cryo-EM services. We also thank Florida State University for providing cryo-EM data collection services. This research was made possible by the NIH grant R01 GM02946.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Review* **Concepts to Reveal Parvovirus–Nucleus Interactions**

**Salla Mattola 1, Satu Hakanen 1, Sami Salminen 1, Vesa Aho 1, Elina Mäntylä 2, Teemu O. Ihalainen 2, Michael Kann 3,4 and Maija Vihinen-Ranta 1,\***


**Abstract:** Parvoviruses are small single-stranded (ss) DNA viruses, which replicate in the nucleoplasm and affect both the structure and function of the nucleus. The nuclear stage of the parvovirus life cycle starts at the nuclear entry of incoming capsids and culminates in the successful passage of progeny capsids out of the nucleus. In this review, we will present past, current, and future microscopy and biochemical techniques and demonstrate their potential in revealing the dynamics and molecular interactions in the intranuclear processes of parvovirus infection. In particular, a number of advanced techniques will be presented for the detection of infection-induced changes, such as DNA modification and damage, as well as protein–chromatin interactions.

**Keywords:** parvoviruses; nucleus; imaging of viral interactions and dynamics; analysis of protein– protein interactions; analysis of virus–chromatin interactions

## **1. Introduction**

Parvoviruses are not only significant pathogens causing diseases in humans and animals but also promising candidates in gene therapy, in oncolytic therapy, in vaccine development, and as passive immunization vectors [1–7]. Compared to some other viruses that only need a few viral particles for infection, parvoviruses are extremely inefficient. In infection and disease development, this incapability is compensated by high replication. Finding new ways to treat parvoviral diseases and to facilitate the development of parvovirus-based therapies requires deepening the understanding of infection and propagation in their host cells.

Although parvoviruses and their infection have been extensively studied throughout the past decades, there is still a lack of molecular level understanding of the virus–host cell interactions. Due to their high particle-to-infectious-unit ratio, the identification and tracking of virus-induced events, which contribute to viral propagation, is a key challenge. Furthermore, the small size of parvovirus (~20 nm in diameter) hinders the attachment of fluorescent probes, which limits capsid detection by single-virus tracking.

Parvoviruses are divided into two classes: autonomous parvoviruses, such as canine parvovirus (CPV), minute virus of mice (MVM), and rat parvovirus (H-1PV), and dependoparvoviruses, such as adeno-associated viruses (AAV), which require coinfection with either adenoviruses or herpes simplex virus in their late stages of infection [8]. Parvoviruses are composed of two to three capsid proteins (viral proteins, VPs; VP1, 2, and 3). They enclose a c. 5 kb-long ssDNA genome, which consists of two overlapping open reading frames. The expression is controlled by two promoters, the early P4 and late P38. The former guides the expression of viral nonstructural proteins 1 and 2 (NS1 and NS2), while the latter controls the expression of capsid proteins [9–11]. In the infectious virion, which has a diameter of 18–26 nm, the genome is covalently bound to the NS1 (Rep78 in AAV) protein [12–15]. This protein is cytotoxic and has central roles in viral replication attributed to its helicase, endonuclease, ATPase, and site-specific DNA-binding activities [16,17]. NS2 plays a role in viral replication [12,18], development of viral replication centres [19], viral mRNA translation [20], and the assembly [21] and nuclear egress of capsids [22–26]. In gene therapy, which is mostly based on AAV, the single-stranded genome is replaced by a double-stranded self-complementary genome, which does not allow replication [15].

**Citation:** Mattola, S.; Hakanen, S.; Salminen, S.; Aho, V.; Mäntylä, E.; Ihalainen, T.O.; Kann, M.; Vihinen-Ranta, M. Concepts to Reveal Parvovirus–Nucleus Interactions. *Viruses* **2021**, *13*, 1306. https://doi.org/10.3390/v13071306

Academic Editor: Giorgio Gallinella

Received: 1 June 2021 Accepted: 2 July 2021 Published: 5 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

After the cellular entry and cytoplasmic release, parvoviral capsids enter the nucleus through the nuclear pore complexes (NPCs) and/or via disruption of the nuclear envelope (NE) [27–34]. The VP1 capsid protein bears nuclear localization signals (NLSs) within its VP1-unique region in the N-terminal domain [35–41], which are thought to allow nuclear import by interaction with nuclear transport factors of the importin family [30,42,43]. In assembled capsids, this domain is hidden.

Once arriving in the nucleus, the genome replicates via a rolling circle mechanism, during which the genome concatemer is cleaved to monomers by NS1 [44]. The gene expression of parvoviruses is coupled to the S-phase of the cell cycle, and it leads to the formation of distinct replication centre foci where viral gene transcription and productive replication occur [19,45,46]. As the infection proceeds, the replication centres expand [27,28,47], which is accompanied by changes in the cellular chromatin structure and chromatin marginalization to the nuclear periphery at later stages of infection [45,47]. Besides the dramatic morphological changes, parvovirus infections are known to induce substantial damage to the host DNA [48–50], and MVM replication centres have been shown to associate with the sites of cellular DNA damage [51,52]. This allows the virus to recruit cellular DNA replication and DNA damage response proteins, which promote viral replication and gene expression [45,49,53]. NS1 of MVM is responsible for nicking the host DNA, which subsequently results in S phase cell cycle arrest [54]. However, during human parvovirus B19 (B19V) infection, a G2/M arrest is induced by the NS1 protein through a p53-independent pathway, which does not depend on the DNA damage response [50]. In addition to evoking disturbances in the cell cycle, parvoviruses are known to cause apoptosis of the infected cells, another hallmark of DNA damage [55,56].

These nuclear changes are followed by progeny capsid assembly in the nucleus, which is combined with the encapsidation of viral genomes covalently bound to NS1. The progeny virions leave the cell by lysis, probably after export from the nucleus [57–60]. This lytic viral release, in conjunction with the S-phase-dependent replication, enables the use of autonomous parvoviruses in oncotherapy for the destruction of rapidly dividing cancer cells [61].

#### **2. Imaging of Viral Interactions and Dynamics in the Cytoplasm and Nucleus**

To date, a broad variety of microscopy-based imaging and spectroscopy applications have shed light on the steps in the early infection of several parvoviruses (Figure 1). Upon nuclear import, CPV can pass the NE [27,28,62,63], which was confirmed by single-particle tracking analyses of fluorophore-labelled AAV capsids (Figure 1, boxes 1 and 2) [64]. Similar analyses have also been used to study the receptor binding of canine parvovirus [65,66] as well as the cytoplasmic trafficking [67] and nuclear import of AAV [27,28,64,68].

**Figure 1.** Imaging of viruses in the nucleus of infected cells. The schematic represents the fluorescence microscopy methodology for the imaging of the parvoviral life cycle in the nuclear region. (1) Analysis of fluorescent virus particle dynamics by single-particle tracking and high-speed super-resolution microscopy verified the import of viral capsids through the nuclear pore complex. Image correlation analysis using the pair correlation function (pCF) revealed the importin β-mediated nuclear transport of capsids. Confocal microscopy combined with EM characterized an alternative nuclear entry pathway for parvoviruses through virus-induced nuclear envelope ruptures. (2) Tracking of fluorescent capsids after their nuclear entry demonstrated that they moved by diffusion in the nucleoplasm. Furthermore, image correlation using the autocorrelation function (ACF) indicated that the capsids were disintegrated after their nuclear import. (3) Super-resolution microscopy analysis indicated that viral replication centres were located close to sites of cellular DNA damage. Fluorescence recovery after photobleaching (FRAP) studies showed that infection affected the diffusion of nuclear proteins, such as transcription-associated proteins. (4) Fluorescent tagging of progeny capsids (green) has allowed for analyses of capsid dynamics in living cells. Images were created with BioRender.com.

Imaging of autonomous parvovirus capsids has partially been hampered by the limited possibilities to express recombinant viruses that contain fluorescent proteins, as the enlarged genome size leads to poor viral genome packaging. Therefore, little is known about virus–nucleus interactions following the assembly of viral capsid. However, AAV-2 studies have shown that large peptides can be inserted into the VP2 protein with a minimal effect on viral assembly or infectivity [69]. This has allowed the creation of fluorescent protein-tagged AAV particles for live cell analysis of intranuclear dynamics [70]. The loop regions of AAV capsid proteins exposed to the capsid surface have been used for the insertion of shorter peptides, which enables the labelling of viral particles with a fluorescent dye [71,72].

Tracking of individual viruses is a powerful tool to examine the mechanisms of their intracellular transport, and it is straightforward, for example, to conclude whether the motion is directed or random diffusion. For active processes, such as transport along microtubules, the dynamics can be deduced from a low number of particles. However, studies of the parvoviral life cycle have revealed that many events are governed by diffusive dynamics. For example, following the trajectories of Cy5-labelled AAV capsids in the cytoplasm and nucleus showed that the majority of capsids move by regular diffusion, but a smaller fraction of the capsids exhibits anomalous subdiffusion [64]. The analysis of a small number of randomly diffusing particles is challenging, but when the motions of typically hundreds or thousands of particles are averaged, their movement can be characterized. The mean squared displacement (MSD) of the particles follows the law *MSD* = 2*dDt*, where *D* is the diffusion coefficient of the particle, *d* is the dimensionality of the motion, and *t* is the time. Measuring the MSD allows for the determination of the particle diffusion coefficient, which can then be further connected to the particle radius *r*, temperature *T*, and viscosity *η* of the medium (with *k<sub>B</sub>* denoting the Boltzmann constant) by the Stokes–Einstein equation:

$$D = \frac{k_B T}{6\pi\eta r}.$$
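As an illustration of how the relation *MSD* = 2*dDt* can be used in practice, the following minimal Python sketch simulates freely diffusing particles and recovers *D* from the ensemble-averaged MSD. The function names, the number of particles, the time step, and the value *D* = 5 μm²/s are assumptions chosen for illustration only; they are not taken from the cited studies.

```python
import numpy as np

def simulate_brownian(n_particles, n_steps, D, dt, d=2, seed=0):
    """Simulate free Brownian trajectories; returns an array (n_particles, n_steps + 1, d)."""
    rng = np.random.default_rng(seed)
    # Each displacement component is Gaussian with variance 2 * D * dt.
    steps = rng.normal(0.0, np.sqrt(2.0 * D * dt), size=(n_particles, n_steps, d))
    start = np.zeros((n_particles, 1, d))
    return np.concatenate([start, np.cumsum(steps, axis=1)], axis=1)

def ensemble_msd(traj):
    """Ensemble-averaged MSD relative to each particle's starting position."""
    disp = traj - traj[:, :1, :]
    return (disp ** 2).sum(axis=2).mean(axis=0)

def estimate_D(msd, dt, d):
    """Least-squares slope through the origin for MSD = 2 * d * D * t."""
    t = np.arange(len(msd)) * dt
    slope = (t * msd).sum() / (t * t).sum()
    return slope / (2.0 * d)

# Hypothetical example: 2000 particles, D = 5 um^2/s, 10 ms frames, 2D tracking.
traj = simulate_brownian(n_particles=2000, n_steps=100, D=5.0, dt=0.01, d=2)
D_est = estimate_D(ensemble_msd(traj), dt=0.01, d=2)
print(f"estimated D = {D_est:.2f} um^2/s")
```

With only a handful of trajectories, the same estimate scatters widely around the true value, which is why averaging over hundreds or thousands of particles is needed for diffusive motion.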

Recently, image correlation spectroscopy has been used to verify the nuclear capsid import and intranuclear disassembly of capsids in living cells (Figure 1, boxes 1 and 2) [30]. Image correlation methods are based on the principles of fluorescence correlation spectroscopy (FCS), which measures fluctuations of fluorescence intensity in a small volume by using the focused excitation laser beam. The recorded fluctuations in photon counts, collected as a time series, are used to calculate the time autocorrelation function (ACF) to resolve the dynamics of fluorescently tagged proteins. The ACF represents the correlation of the fluorescent signal between the starting time point (t = t0) and following time points (t = t0 + Δt) of the experiment, thus yielding information on fluorescent molecule diffusion time in the focal spot. In parvovirus studies, the ACF calculated for a time series of laser scanning microscopy images containing temporal information of the intensity fluctuations and spatial distribution maps of the fluorescent viral particles has enabled the analysis of fast and slow diffusion, or even immobile viral particles [30].
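The calculation of the ACF from a recorded intensity trace can be sketched as follows. This is a minimal illustration assuming the commonly used normalization G(τ) = ⟨δF(t) δF(t + τ)⟩ / ⟨F⟩²; the function name and the synthetic trace are hypothetical, and the sketch does not reproduce the analysis pipeline of the cited study.

```python
import numpy as np

def autocorrelation(intensity, max_lag):
    """Normalised ACF G(tau) = <dF(t) dF(t+tau)> / <F>^2 for lags 0 .. max_lag - 1."""
    f = np.asarray(intensity, dtype=float)
    mean = f.mean()
    df = f - mean  # fluctuations around the mean intensity
    g = np.empty(max_lag)
    for lag in range(max_lag):
        g[lag] = np.mean(df[: f.size - lag] * df[lag:]) / mean ** 2
    return g

# Synthetic trace (assumption): smoothed noise on a constant background,
# so that the fluctuations are correlated over roughly 20 frames.
rng = np.random.default_rng(1)
slow = np.convolve(rng.normal(size=10_000), np.ones(20) / 20, mode="same")
trace = 100.0 + 10.0 * slow
g = autocorrelation(trace, max_lag=50)
print(g[:5])
```

For diffusing fluorescent molecules, the lag at which G(τ) decays reflects their residence time in the focal spot, so slowly decaying curves indicate slow or bound populations.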

To obtain more information about the possible directed movement of fluorescent particles, pair correlation function (pCF) analysis can also be used. The pCF measures the correlation over time and space and thus can distinguish directed movement or obstacles to diffusion. In parvovirus studies, pCF revealed a positive correlation between pixels across the NE within an image series, thereby demonstrating the nuclear import of capsid through the NE [30,73–75]. In addition, pCF analysis detected a spatiotemporal correlation between the fluorescent viral capsid and importin β, suggesting that importin β mediates capsid translocation through the nuclear pore complex [30]. An alternative, or parallel, nuclear entry pathway has been inferred from studies using fluorescence and electron microscopy. These experiments have demonstrated that the NE undergoes substantial damage at early times during parvovirus H-1, CPV, and AAV2 infection, indicating an NPC-independent nuclear entry of capsids [31,33].
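The idea behind the pCF can be sketched with two intensity traces recorded at two pixels separated by a fixed distance, for instance on opposite sides of the NE. The sketch below assumes the normalization G(τ) = ⟨F_a(t) F_b(t + τ)⟩ / (⟨F_a⟩⟨F_b⟩) − 1; the function name and the synthetic data (a signal that reappears at the second pixel after seven frames) are hypothetical.

```python
import numpy as np

def pair_correlation(f_a, f_b, max_lag):
    """pCF between two pixels: G(tau) = <F_a(t) F_b(t+tau)> / (<F_a><F_b>) - 1."""
    a = np.asarray(f_a, dtype=float)
    b = np.asarray(f_b, dtype=float)
    norm = a.mean() * b.mean()
    n = a.size
    return np.array(
        [np.mean(a[: n - lag] * b[lag:]) / norm - 1.0 for lag in range(max_lag)]
    )

# Synthetic example (assumption): fluctuations seen at pixel A arrive at pixel B 7 frames later.
rng = np.random.default_rng(2)
pixel_a = rng.poisson(10, size=5000).astype(float)
pixel_b = np.roll(pixel_a, 7)
pcf = pair_correlation(pixel_a, pixel_b, max_lag=20)
print("lag of maximum correlation:", int(np.argmax(pcf)))
```

A maximum at a positive lag indicates that particles travel from the first pixel to the second with a characteristic transit time, whereas the absence of a peak is consistent with a barrier between the two positions.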

The theoretical nuclear diffusion coefficient of capsids obtained from the Stokes–Einstein law, assuming that the viscosity of the nucleoplasm is approximately four times higher than in water [76,77], is in the order of 10 μm²/s. This is in accordance with the experimental finding of 5 μm²/s obtained for the mobile population of virus-like particles of parvovirus [30,47]. On the cellular scale, this is a relatively fast diffusion rate, and it means that on average, the virus particles are able to diffuse a 10 μm distance in a time scale of a few seconds, when not restricted by physical barriers or by interactions.
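These orders of magnitude can be checked with a short calculation. The sketch below assumes a capsid radius of 12.5 nm (half of a ~25 nm capsid diameter), a temperature of 37 °C, and a water viscosity of about 0.69 mPa·s at that temperature; these input values and the function names are assumptions for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def stokes_einstein(radius_m, temperature_k, viscosity_pa_s):
    """Diffusion coefficient (m^2/s) of a sphere from the Stokes-Einstein equation."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)

def diffusion_time(distance_m, D, d=3):
    """Time (s) at which MSD = 2 * d * D * t reaches distance^2."""
    return distance_m ** 2 / (2.0 * d * D)

ETA_WATER_37C = 0.69e-3  # Pa*s, assumed viscosity of water at 37 degrees C
RADIUS = 12.5e-9         # m, assumed capsid radius

# Nucleoplasm assumed to be four times more viscous than water.
D_theory = stokes_einstein(RADIUS, 310.15, 4 * ETA_WATER_37C)
# Time to cover 10 um with the experimentally reported D = 5 um^2/s.
t_10um = diffusion_time(10e-6, 5e-12)

print(f"theoretical D = {D_theory * 1e12:.1f} um^2/s")
print(f"time to diffuse 10 um = {t_10um:.1f} s")
```

Under these assumptions, the theoretical value falls between 5 and 10 μm²/s, and a 10 μm distance is covered in a few seconds, in line with the estimates given in the text.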

Studies of nucleoplasmic capsid diffusion coefficients by ACF, which improved temporal resolution from the millisecond to microsecond scale, have revealed distinct diffusion dynamics for intact capsids and potential capsid fragments, suggesting that capsids are disintegrated in the nucleoplasm after their import [30]. The detailed mechanisms by which the viral genome is released into the nucleoplasm remain to be determined. However, fluorescence microscopy analyses have shown that capsids are already modified prior to nuclear import and nuclear disassembly when VP1 N-terminus is exposed during the endocytic entry [41,78–80]. According to immunoprecipitation analyses, B19V capsid uncoating is enhanced by cytoplasmic divalent cations [81]. Previously published studies have demonstrated that at least for MVM, the nuclear release of DNA occurs without a complete disassembly of the capsids [78,82–85]. In summary, it can be concluded that parvoviral capsids enter the nucleus either via NPC or by passing through transient holes in the NE, which allow the entry of intact capsids. Intact capsids entering the nucleus may undergo structural change which leads to viral genome release at some distance from the NE [30,86].

As outlined before, progressing parvovirus infection leads to the development of viral replication centres [46,87] and relocation of host chromatin to the nuclear periphery [45,47–49,88]. Recently, super-resolution microscopy has demonstrated that viral replication centres originate close to DNA damage sites (Figure 1, box 3) [52]. The introduction of photobleaching experiments in the analyses of intranuclear mobility and kinetics of viral and cellular proteins has allowed better monitoring of nuclear changes upon parvoviral infection (Figure 1, box 3). In these studies, a high-intensity laser is used to photobleach the fluorescence of a fluorescent molecule, typically a fluorescent fusion protein, from a defined area of the cell. In fluorescence recovery after photobleaching (FRAP), a region of interest is bleached, and the recovery of fluorescence in the bleached region is measured. The rate of fluorescence recovery is determined by the exchange of fluorescent molecules between the bleached region and the surrounding unbleached area, thereby allowing the analysis of protein dynamics and interactions. In fluorescence loss in photobleaching (FLIP), an area of the cell is continuously photobleached with laser pulses, and images taken between the pulses measure the response in the entire pool of fluorescent molecules. Similar to FRAP, the rate of fluorescence loss is related to the mobility of the fluorescent molecules.
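A FRAP recovery curve is commonly summarized by its mobile fraction and recovery half-time. The following minimal sketch extracts both from an intensity trace of the bleached region; the function name, the single-exponential synthetic curve, and all numerical values are assumptions for illustration, and corrections for acquisition bleaching or background are omitted.

```python
import numpy as np

def frap_summary(t, f, f_pre, n_plateau=5):
    """Return (mobile_fraction, half_time) from a post-bleach recovery curve.

    t: times after the bleach (s); f: intensity in the bleached region;
    f_pre: mean pre-bleach intensity in the same region.
    """
    t = np.asarray(t, dtype=float)
    f = np.asarray(f, dtype=float)
    f0 = f[0]                        # intensity immediately after the bleach
    f_inf = f[-n_plateau:].mean()    # plateau estimated from the last frames
    mobile_fraction = (f_inf - f0) / (f_pre - f0)
    half_level = f0 + 0.5 * (f_inf - f0)
    i = int(np.argmax(f >= half_level))  # first frame at or above the half level
    if i == 0:
        return mobile_fraction, t[0]
    # Linear interpolation between the two frames bracketing the half level.
    frac = (half_level - f[i - 1]) / (f[i] - f[i - 1])
    return mobile_fraction, t[i - 1] + frac * (t[i] - t[i - 1])

# Synthetic recovery (assumption): bleach to 20%, recovery to 80% with tau = 2 s.
t = np.linspace(0.0, 30.0, 301)
f = 0.2 + 0.6 * (1.0 - np.exp(-t / 2.0))
mobile, t_half = frap_summary(t, f, f_pre=1.0)
print(f"mobile fraction = {mobile:.2f}, half-time = {t_half:.2f} s")
```

A reduced mobile fraction or a longer half-time than expected for free diffusion points to binding of the fluorescent protein to less mobile nuclear components.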

In CPV infection, FRAP experiments (Figure 1, box 3) have revealed that the dynamics of transcription-associated protein change during infection [89] and further demonstrated that infection leads to an increased protein mobility in the nucleoplasm, which potentially alters protein–protein and protein–DNA binding reactions during viral replication [47]. Additionally, FRAP has been used to study the kinetics of NS1-EYFP in noninfected cell nuclei. The results have shown that NS1-EYFP mobility is not consistent with free diffusion and suggested transient binding to nuclear components [90]. FLIP experiments have further revealed the nucleocytoplasmic shuttling of NS1-EYFP [90].

Further central questions in the late stages of the nuclear life cycle of parvoviruses, such as capsid assembly and nuclear egress, have been addressed using fluorescent microscopy of immunostained cells. These studies, in combination with biochemical characterizations, showed that MVM capsids assemble in the nucleus from VP1/VP2 trimers [60,91], and these trimers expose a structured nuclear localization motif [58]. For AAV-2, the subcellular localization of capsid assembly to nucleoli was identified with immunofluorescence and in situ hybridization microscopy techniques. Viral genome sequence analysis and mutational studies revealed that the capsid assembly is mediated by the viral assembly associated protein (AAP) [92,93]. Moreover, X-ray crystallography and cryo-EM analyses of MVM capsids demonstrated that viral DNA is packed through a fivefold packaging channel [94,95]. Studies have also revealed that MVM capsids leave the nucleus prior to cell lysis and NE breakdown [96], suggesting that capsids have to exit the nucleus through the NPCs [22,23]. A similar combination of techniques was used to show that MVM capsids egress the nucleus dependent upon chromosomal region maintenance 1 (CRM1, also known as exportin 1) protein [96], which is a nuclear export factor for various proteins and different cellular RNAs (snRNA, rRNA, some mRNAs) [97]. Notably, the nuclear exit was limited to genome-containing capsids phosphorylated in the unordered domain of VP2, while empty capsids exhibited nuclear accumulation [96]. By combining classical immunofluorescence microscopy with surface plasmon resonance spectroscopy, it has been shown that the CRM1-dependent nuclear export of MVM capsids is mediated by the supraphysiological NES in NS2 [22].

#### **3. Screening and Validation of Protein–Protein Interactions**

The nuclear import of intact parvovirus capsids is not limited by the NPC diameter, which is able to transport particles with a diameter of ~39 nm [98]. There is accumulating evidence that the nuclear entry of the parvovirus capsid depends on the host machinery for nuclear import, requiring coordinated interaction with different host proteins. Earlier studies have shown that the capsid proteins of MVM and CPV, in addition to AAV capsids, have basic regions containing NLSs or a structured nuclear localization motif [35–41,60,79]. During endocytic entry, the acidification of the capsid leads to NLS exposure, which, after the capsid has reached the cytoplasm, would allow the attachment of nuclear import factors. Studies including coimmunoprecipitation assays (Co-IP) have verified that CPV and AAV2 capsids interact with Imp β [42,99]. However, these assays elucidate neither the localization of the interaction in the cell environment nor the phase of the infection. The proximity ligation assay (PLA) has allowed comprehensive imaging and quantitation of interactions within the host cell. This antibody-based technique enables the detection of two proteins that are in close proximity to each other (~40 nm) [100]. Therefore, PLA is capable of visualizing protein–protein interactions beyond the diffraction limit (Figure 2A). For CPV, in situ proximity ligation analysis, combined with confocal microscopy and image analysis, has demonstrated that capsids are able to recruit cytoplasmic Imp β for nuclear transport [42]. Coimmunoprecipitation analyses have indicated that entering H-1PV and AAV2 capsids interact with nucleoporins, which are proteins of the NPC [31].

**Figure 2.** Analyses of protein–protein interactions in infection. Schematic overviews of proximity ligation assay (PLA) and proximity-dependent biotin identification (BioID) methods to identify and localize interactions between viral and host proteins. (**A**) The schematic representation of PLA assay. (1) Primary antibodies are used to target proteins of interest shown in red and green. (2) Secondary antibodies with PLA oligonucleotide probes bind to the primary antibodies. (3) Closely located PLA probes are ligated together, and (4) the formed circular DNA is amplified. (5) The amplified DNA (red) is labelled by fluorescent probes (green). (6) Confocal microscopy image shows the intracellular distribution of the PLA signals (green). Nuclei were stained with DAPI (grey). (**B**) Outlines of the BioID workflow. (1) Transfection of cells with BirA\*-viral protein-fusion constructs and the generation of a stable inducible cell line. (2) Addition of biotin to the culture media and viruses if infection is required. (3) Cell culture period during which biotin ligase activity of BirA\* fusion protein induces proximity-dependent biotinylation of neighbouring endogenous and viral proteins. (4) Cell lysis and the streptavidin-affinity purification of biotinylated proteins from cell lysates. (5) Mass spectrometry and analyses of protein associations. (6) Interaction network indicating interaction partners of viral protein and biological processes involved. Images were created with BioRender.com.

Knowledge of viral protein interactions with cellular proteins is essential for understanding the intranuclear processes such as viral replication, capsid assembly, and nuclear egress. Affinity purification-mass spectrometry proteomics approaches have been traditionally used to analyse protein–protein interactions in infection [101–103]. Recently, many new screening methods have been generated to recognize protein–protein associations [104–106]. One of the methods is the proximity-dependent biotin identification (BioID) assay combined with mass spectrometry [107–109] (Figure 2B). BioID is a proximity-tagging method that utilizes a fusion of promiscuous biotin ligase, BirA, to a protein of interest to identify protein–protein associations and proximate proteins. The working radius for biotinylation via BirA is 10–40 nm, depending on the application used. Mass spectrometry-based proteomics applications such as BioID are able to recognize highly transient protein–protein interactions during the viral lifecycle. BioID studies of parvovirus human bocavirus 1 (HBoV1) have revealed interaction between viral nuclear protein 1 (NP1) and factors mediating nuclear import and mRNA processing [110]. A BioID analysis of AAV2 Rep proteins has revealed their association with cellular proteins, such as the transcriptional corepressor KAP1, which assist the viral genome in resisting epigenetic silencing, thereby allowing the lytic replication of AAV [111]. BioID has also been used to recognize interactions between viral proteins and DNA damage-related proteins. BioID has revealed an AAV Rep protein interaction with the Mre11 part of the MRN complex, an important initiator of the ATM response [111]. Overall, BioID has allowed for identifying associations of the viral protein of interest in a wide variety of nuclear processes, which, for CPV NS2, include DNA damage response and chromatin modification [112].

#### **4. Detection of DNA Damage, DNA Repair, and Virus–DNA Interactions**

Progression of parvovirus infection depends upon the induction of a cell cycle arrest and cell lysis. It leads to the activation of DNA damage response (DDR) [19,45], which promotes the infection and viral reproduction [113,114]. Ataxia telangiectasia and Rad3-related (ATR)-mediated DDR activation is linked to replication fork stalling, whereas the activation of the ataxia telangiectasia mutated (ATM)-mediated route is the initial response to a double-stranded DNA break (DSB) [115,116]. The activation of the ATR route has been observed for MVM, B19, and HBoV1 [51,111,112], and the ATM route for MVM, HBoV1, and AAV [45,117–119] (Figure 3A). Recognition of DNA damage induces the recruitment of proteins responsible for DNA damage repair to the site of the damage. During parvovirus infection, the emergence of DNA damage can be observed either indirectly by the accumulation of DDR proteins to the damage site or by observing the formation of actual DNA breakages. MVM infection has been shown to cause accumulation of proteins of the ATM signalling route (e.g., phosphorylated H2AX (γ-H2AX), Nbs1, RPA32, Chk2, p53, MDC1, MRN) to the replication start sites together with the viral replication protein NS1 [19,45]. During viral replication, at least newly synthesized viral DNA is bound to RPA, a known activator of ATR [120]. However, in MVM infection, this does not lead to the full activation of the ATR response since checkpoint kinase 1 (Chk1) is not activated [49,51] (Figure 3A).

Recently, a high-throughput viral chromosome conformation capture sequencing assay (V3C-seq) has been applied to study the association of MVM viral genomes with host chromatin [121] (Figure 3B). V3C-seq is based on the chromosome conformation capture sequencing technology (3C-seq) [122] used to study chromosome arrangement in the nucleus by crosslinking the sites of genomic associations and identifying these regions with sequencing. V3C-seq studies have revealed that MVM genomes become associated with DNA damage sites during early stages of infection [121]. These sites of DNA damage with associated viral genomes increase as the infection proceeds. Nuclear localization of this association was further verified with fluorescent in situ hybridization (FISH) and super-resolution stochastic optical reconstruction microscopy (STORM). The introduction of externally induced DNA damage sites with laser irradiation or with CRISPR-Cas9 to a specific genomic locus resulted in parvoviral genome association with these regions. V3C-seq analyses have also revealed that the viral genome association sites and DNA damage sites overlap with self-interacting genetic regions, also known as topologically associating domains (TADs) [52]. Recently, it has been shown that the localization of viral genomes to the DNA damage sites is mediated by viral NS1 [121].

**Figure 3.** Approaches revealing virus-induced DNA damage. The schematic diagram of diverse methods for the analyses of DNA damage response (DDR), viral and host DNA interactions, and DNA damage in infection. (**A**) Analyses of ATM and ATR-mediated DNA damage signalling pathways by confocal and super-resolution microscopy. The ATM-mediated cellular response to DNA damage functions through phosphorylation of proteins related to DNA damage and DNA damage repair such as γ-H2AX, MDC1, Rad50, Nbs1, and Mre11. In MVM infection, these proteins are found in replication start sites together with viral NS1. In parvovirus-infected cells, the ATR-mediated response depends on RPA and viral NS1 interaction. (**B**) Elucidation of interactions between viral genome and host cell chromatin by using high-throughput viral chromosome conformation capture sequencing assay (V3C-seq). Moreover, the association of MVM genomes with DNA damage sites has been shown by ChIP-seq. This analysis has been used to verify the association between NS1-mediated viral genome replication and DDR. (**C**) Studies of host cell chromatin disintegration by comet assay. Images were created with BioRender.com.

Classical DNA damage analyses in viral infection are qPCR or agarose gel electrophoresis, which do not allow investigations on the single-cell level. This obstacle was solved by the comet assay, also known as single-cell gel electrophoresis, which is a sensitive, quantitative, and relatively simple imaging-based method to observe DNA breakages (Figure 3C) [123–125]. Scraped or trypsinized cells are cast into low-density agarose gel and lysed, after which the remaining nucleoids are placed in an electric field and stained. DNA lesions, both single and double stranded, result in a relaxation of DNA supercoiling. The relaxed DNA loops migrate towards the positively charged pole during electrophoresis, forming the characteristic comet tail pattern. The relative DNA content in the comet tail versus the head thus reflects the number of DNA lesions. Unlike the various DDR pathway markers, which might be activated in response to viral genomes or proteins [126], this method relies on the physical properties of damaged host DNA. Comet assay studies and ChIP-seq analysis have demonstrated that MVM infection causes host DNA damage, which increases as the infection proceeds [52]. In contrast, the comet assay has revealed no significant DNA damage in cells infected by the bocavirus minute virus of canines [127], nor in cells infected by human B19V [127]. The potential nucleolytic activity of parvoviral NS1 protein against host DNA has been investigated in expression studies for HBoV1 [117] and human B19V [127], but these studies did not find significant host DNA damage in NS1-expressing cells.
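The relative DNA content of the comet tail is often expressed as the percentage of tail DNA. A minimal sketch of this calculation is given below, assuming that the background-corrected integrated fluorescence intensities of the head and the tail have already been measured by image analysis; the function name and the example values are hypothetical.

```python
def percent_tail_dna(head_intensity, tail_intensity):
    """Percentage of total comet fluorescence found in the tail."""
    total = head_intensity + tail_intensity
    if total <= 0:
        raise ValueError("total comet intensity must be positive")
    return 100.0 * tail_intensity / total

# Hypothetical integrated intensities (arbitrary units) for two single cells.
control_cell = percent_tail_dna(head_intensity=950.0, tail_intensity=50.0)
damaged_cell = percent_tail_dna(head_intensity=600.0, tail_intensity=400.0)
print(control_cell, damaged_cell)
```

Because the measure is computed per nucleoid, distributions over many cells can be compared between infected and noninfected samples.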

To benefit from host cell responses such as the DDR, viral proteins or viral genomes are required to interact directly with DNA or DNA-modifying proteins. The interactions of cellular DNA-binding proteins and viral proteins with host chromatin and viral genomes in MVM and CPV infections have been studied by ChIP-seq methods [52,88,121]. These studies have shown the acetylation of histones bound to CPV genome and MVM genome association with cellular γ-H2AX sites and the viral NS1 protein [52,88,121] (Figure 3B). Furthermore, the studies of the genomic reactivation of latent AAV genome by ChIP and ChIP coupled to qPCR have revealed the mechanism by which cellular proteins induce viral genome repression [111].

#### **5. Recent Methods for Future Studies of Parvovirus–Nucleus Interactions**

Despite decades of research, many detailed mechanisms of virus–host interactions are not well understood, and many new observations raise further questions, requiring the use of newly developed techniques. Next-generation sequencing (NGS) and fluorescence imaging technologies are currently advancing rapidly [128,129], offering excellent opportunities for detailed analysis of infection-induced changes in the host chromatin organization and high-resolution imaging of parvovirus infection. For example, these methods combined with spatial transcriptomics allow analyses of the spatial heterogeneity of the gene expression within the sample [130–133].

NGS is a modern sequencing methodology where massive parallel sequencing is used to map the sequences of millions of small DNA fragments. Bioinformatics is then used to combine the acquired sequencing data, which can then be compared to reference genome(s). Various approaches allow for obtaining information about expressed genes [134], genome accessibility [135], binding regions of different DNA interacting proteins [136–138], or chromatin–chromatin interactions and organization [139]. As an example, the assay for transposase-accessible chromatin with sequencing (ATAC-seq) is based on hyperactive Tn5 transposase mutants [135]. In this assay, the hyperactive Tn5 is used to tagment the accessible chromatin by conjugating short and specific DNA oligomers into the accessible regions. These regions of the genome are then isolated and sequenced, yielding a high-resolution map of the accessible regions of the genome. Thus, ATAC-seq has great potential in studies on how parvoviral infection changes the host cell chromatin organization or in studies of viral genome packaging or release. This is exemplified by recent results showing that baculovirus infection induces significant changes in the organization of host genome, such as an increase in chromatin accessibility, relocation close to the NE, and nucleosome disassembly [140]. Moreover, ATAC-seq analysis of Epstein–Barr virus (EBV), a member of the herpesvirus family, has demonstrated that B cell chromatin undergoes significant remodelling during infection, which leads to the regulation of cell cycle, apoptosis pathways, and interferon regulatory factors [141]. Another example of a similar DNA-tagging method is DNA adenine methyltransferase identification (DamID)-sequencing [142]. Here, DNA adenine methyltransferase (Dam) is fused to a protein of interest, and this fusion protein is expressed in cells. The Dam enzyme recognizes DNA sequence GATC and methylates the adenine in the close vicinity of the fusion protein.
These methylated regions of chromatin can then be sequenced and mapped. Thus, these sequences correspond to the chromatin that has been in close vicinity to the expressed fusion protein. This DamID-seq has been used to map the chromatin interacting with the nuclear lamina and lamina-associated domains [143]. In addition to sequencing, both ATAC-seq and DamID-seq can be combined with high-resolution fluorescence imaging. In the case of ATAC-seq, fluorescent oligomers are used together with hyperactive Tn5, and therefore, the tagmented and accessible chromatin can be visualized by fluorescence microscopy. This ATAC-see method [144] allows imaging the accessible chromatin regions and would be directly applicable to parvoviral studies regarding host cell chromatin or viral genome organization. DamID can be used together with methylated DNA-recognizing fluorescent m6A-tracer fusion protein. m6A-tracer binds to the GATC sequence when adenine is methylated by Dam methylase. By fusing m6A-tracer to a fluorescent protein, the fluorescent signal localizes to the methylated DNA [145]. The great advantage of the DamID m6A-tracer system is the possibility to use it in living cells. Thus, one can follow the chromatin dynamics by live cell microscopy. We envision that the system could be used to follow parvovirus infection-induced dynamic reorganization of the host genome.
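Because Dam methylates only adenines within GATC motifs, the positions of these motifs define where a DamID signal can arise in a given sequence. The short sketch below lists the GATC positions in a DNA string; the function name and the example sequence are hypothetical, and since GATC is its own reverse complement, the same positions apply to both strands.

```python
def gatc_sites(sequence):
    """Return the 0-based start positions of all GATC motifs in a DNA sequence."""
    seq = sequence.upper()
    sites = []
    pos = seq.find("GATC")
    while pos != -1:
        sites.append(pos)
        pos = seq.find("GATC", pos + 1)
    return sites

# Hypothetical sequence fragment; lower-case input is accepted as well.
example = "aaGATCttgatcGATC"
print(gatc_sites(example))
```

The density of such sites along a genome sets the attainable resolution of a DamID map, which is worth considering for the compact parvoviral genome.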

Imaging and sequencing approaches are directly combined in spatial transcriptomics, where transcriptomes are resolved by high-resolution microscopy or by capturing, so that spatial information about the location is also recorded. In microscopy-based spatially resolved transcriptomics or genomics, the different RNA and DNA species are labelled via sequential fluorescence in situ hybridization and barcoding. This approach offers the highest resolution, and recently, the imaging of 3660 chromosomal loci together with 17 chromatin marks in single cells has been reported [146].

## **6. Concluding Remarks**

Conventional confocal microscopy approaches, including the imaging of fluorescent viral capsids and proteins and their interplay with cellular components within the host cell, have been successfully used in parvovirus studies. The development of live cell imaging and super-resolution microscopy, combined with image data analysis, together with the development of new screening tools for analyses of protein–protein and DNA–protein interactions, has further enhanced our understanding of virus–nucleus interactions and the nuclear dynamics of infection. In the near future, combining fluorescence data and ultrastructural information from electron micrographs will allow answering detailed questions regarding the mechanisms of intranuclear events in viral infection. Moreover, the advances in super-resolution microscopy applications will enable us to probe cell–virus interactions and dynamics in previously unattainable detail.

**Author Contributions:** S.M., S.H., S.S., V.A., E.M., T.O.I., M.K. and M.V.-R. wrote the paper. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Jane and Aatos Erkko Foundation (M.V.-R.), Academy of Finland, grant numbers n330896 (M.V.-R.), 308315 (T.O.I.), 314106 (T.O.I.), and 332615 (E.M.); the Biocenter Finland, viral gene transfer (M.V.-R.); the Graduate School of the University of Jyvaskyla (S.M.); and a starting grant of the University of Gothenburg (M.K.).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data available in a publicly accessible repository.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## *Review* **The VP1u of Human Parvovirus B19: A Multifunctional Capsid Protein with Biotechnological Applications**

## **Carlos Ros \*, Jan Bieri and Remo Leisi**

Department of Chemistry and Biochemistry, University of Bern, 3012 Bern, Switzerland; jan.bieri@dcb.unibe.ch (J.B.); remo.leisi@dcb.unibe.ch (R.L.)

**\*** Correspondence: carlos.ros@dcb.unibe.ch

Academic Editor: Giorgio Gallinella Received: 2 December 2020; Accepted: 16 December 2020; Published: 18 December 2020

**Abstract:** The viral protein 1 unique region (VP1u) of human parvovirus B19 (B19V) is a multifunctional capsid protein with essential roles in virus tropism, uptake, and subcellular trafficking. These functions reside on hidden protein domains, which become accessible upon interaction with cell membrane receptors. A receptor-binding domain (RBD) in VP1u is responsible for the specific targeting and uptake of the virus exclusively into cells of the erythroid lineage in the bone marrow. A phospholipase A2 domain promotes the endosomal escape of the incoming virus. The VP1u is also the immunodominant region of the capsid as it is the target of neutralizing antibodies. For all these reasons, the VP1u has raised great interest in antiviral research and vaccinology. Besides the essential functions in B19V infection, the remarkable erythroid specificity of the VP1u makes it a unique erythroid cell surface biomarker. Moreover, the demonstrated capacity of the VP1u to deliver diverse cargo specifically to cells around the proerythroblast differentiation stage, including erythroleukemic cells, offers novel therapeutic opportunities for erythroid-specific drug delivery. In this review, we focus on the multifunctional role of the VP1u in B19V infection and explore its potential in diagnostics and erythroid-specific therapeutics.

**Keywords:** parvovirus B19; B19V; VP1u; receptor; PLA2; virus entry; erythroid cells; biomarker; drug delivery; nanocarrier

## **1. Introduction**

The *Parvoviridae* is a family of nonenveloped viruses that packages a linear, single-stranded DNA genome (~5 kb) within a small (~25 nm) icosahedral capsid. As a direct consequence of their limited coding potential, parvoviruses are particularly dependent on host cellular factors for their replication [1,2]. Parvoviruses are widely spread in nature and their host range might span the entire animal kingdom [3]. Depending on their host, members of the family *Parvoviridae* are subdivided into the subfamilies *Parvovirinae*, infecting vertebrates and *Densovirinae*, infecting insects and other arthropods. Viruses that infect vertebrates, including humans, are further divided into the dependoparvoviruses and the autonomous parvoviruses [4]. The dependoparvoviruses replicate only in the presence of a helper virus, such as adenovirus or herpesvirus. The adeno-associated viruses (AAVs) are not linked with any known pathology, have a wide tissue specificity, and replicate in dividing and nondividing cells. These properties make AAVs useful gene transfer vehicles for therapeutic applications [5]. Although autonomous parvoviruses use similar strategies for cell entry and replication, they differ substantially in their pathogenic potential, which ranges from subclinical to severe or even lethal infections [2]. As autonomous parvoviruses can only replicate in dividing cells, when the host cell DNA replication machinery becomes available, they tend to cause more severe infections in young than in adult hosts.

While most ssDNA viruses show a circular genome structure, parvoviruses have a linear genome that is typically organized in two open reading frames (ORFs). The ORFs are flanked by palindromic sequences of variable length, which fold into hairpin structures and are essential for replication [6,7]. The 5′ ORF (ns or rep gene) encodes the regulatory nonstructural protein(s) required for viral DNA replication and packaging. The 3′ ORF (cap gene) encodes two to four variants of a single capsid protein (VP). Following a principle of genetic economy, the different VPs are generated by alternative splicing or alternative codon usage, but also by post-translational proteolytic processing during entry, resulting in a common C-terminal sequence but different N-terminal extensions of variable length [8–10]. The T = 1 icosahedral parvovirus capsid is assembled from 60 VPs; however, the number of N-terminal VP variants used to assemble the infectious particles varies from two (VP1 and VP2) to four (VP1–VP4) depending on the genus. The VP variants are numbered in order of length, with VP1 being the largest variant. The common C-terminal region of the VPs forms the capsid shell, which consists of a conserved alpha-helix and a jelly roll motif containing eight antiparallel β-strands. The different configurations of the loops connecting the conserved β-strands delineate the surface topology, which is characteristic of each parvovirus genus and defines the virus tropism and antigenicity [11]. Despite low sequence identity, the parvovirus capsids display structural features that are conserved across different genera, i.e., a narrow depression at the twofold axis of symmetry, protrusions of variable size and shape at the threefold axis and a canyon-like structure encircling a cylindrical pore at the fivefold axis connecting with the interior of the capsid.

The minor protein VP1 has an N-terminal extension of variable length, the so-called VP1-unique region (VP1u), and is present at 3 to 10 copies per virion depending on the parvovirus genus. VP1u is not required for virus assembly but contains several essential motifs required for the infection. Nuclear localization signals (NLSs) consisting of a stretch of basic amino acids have been identified in the VP1u from several parvoviruses. These motifs were shown to confer nuclear import potential to the incoming particles [12–19]. Another motif found in VP1u, except for amdoparvoviruses, is a phospholipase A2 (PLA2) enzyme domain, which enables viruses to escape from endosomal vesicles into the cytosol during cell entry [20–26]. Other motifs in VP1u were found to be essential for the infection. In AAV these motifs include signals that are known to be involved in protein interaction, endosomal sorting, and signal transduction in eukaryotic cells [27]. In B19V, a receptor-binding domain (RBD) required for virus uptake was identified at the N-terminal of the VP1u [28].

To infect the cell, parvoviruses follow an intricate path from the cell surface to the nucleus where they deliver the viral DNA for replication. During the process of entry, the incoming parvovirus capsids undergo a program of conformational rearrangements triggered by specific cellular factors that facilitate their intracellular transport [29,30]. A major capsid rearrangement that is largely conserved among parvoviruses involves the externalization of the VP1u region. Initially sequestered in mature virions, VP1u and its essential motifs become accessible at the particle surface during entry triggered by the acidic endosomal environment [10,16,31]. Besides low pH, AAVs may require additional cellular factors [31]. An exception is B19V, whose VP1u becomes accessible during the initial interactions with cellular receptors [32]. VP1u exposure occurs through the five-fold channel that connects with the interior of the capsid [33–36]. Structural and in vitro studies suggest that these channels serve not only as portals for the externalization of N-terminal capsid protein sequences but also for the packaging and release of the viral genome [16,31,37–41]. Mutations that perturb the functional structure of the channel result in defective genome encapsidation, uncoating and VP1u externalization [38,42,43]. B19V represents again an exception as the VP1u seems to be already exposed on the capsid surface, although in a conformation that is not accessible to antibodies [44].

This review focuses on the VP1u of B19V, which shares common aspects with other parvoviruses but has unique features, such as its structural conformation relative to the virion, immunodominance, extraordinary length, or the presence of a receptor-binding domain responsible for the restricted tropism of the virus. We present the current knowledge on the different VP1u motifs, their functions in the virus infection, and the potential biotechnological applications of the B19V VP1u in human therapy and diagnostics.

## **2. Human Erythroparvovirus B19 (B19V)**

B19V is the most prominent and well-characterized human pathogen within the *Parvoviridae* causing a mild childhood rash disease named *erythema infectiosum* or fifth disease [45]. The infection is often asymptomatic; however, in adults, B19V infection may induce a wide range of more severe pathological conditions, such as arthralgias and arthritis [46]. B19V infection may lead to aplastic crisis in patients with pre-existing bone marrow disorders and shortened red cell survival [47] and persistent infection in immunocompromised persons. Infection during pregnancy may result in *hydrops fetalis* and fetal death [48]. B19V was the first parvovirus known to cause disease in humans [49]. Since 2005, other human parvoviruses have been identified and include human bocavirus (HBoV1-4), parvovirus 4, bufavirus, tusavirus and cutavirus. Except for HBoV, which has been implicated in acute respiratory tract infections [50], the rest are emergent human parvoviruses with uncertain clinical significance [45,51].

B19V is transmitted via aerosol droplets that come into contact with the upper respiratory tract mucosa [47]. The virus crosses the mucosal epithelium through a yet unknown mechanism and disseminates via the bloodstream to the bone marrow, where it infects erythroid precursors at a particular erythropoietin (EPO)-dependent stage of differentiation [52–54]. The extraordinarily narrow tropism of B19V is mediated at different levels of the viral life cycle. Crucial steps of the viral infection, such as uptake, genome replication, transcription, splicing and packaging, are restricted to the EPO-dependent erythroid differentiation around the proerythroblast stage [54–60]. The lytic replication cycle results in the destruction of the erythroid precursor cells [61,62], which accounts for the hematological syndromes observed during the infection [47]. Acute infection frequently results in high-titer viremia, which precedes the onset of clinical manifestations and has been associated with B19V transmission through transfusion and plasma-derived medicinal products [63].

## **3. B19V Capsid**

The ssDNA genome of B19V is packaged into a small, nonenveloped, T = 1 icosahedral capsid. Similar to the genome of dependoparvoviruses, the B19V genome has two identical inverted terminal repeats (ITRs; ~383 nt), which serve as the origin of replication [64]. The capsid consists of 60 structural subunits of two N-terminal VP variants, VP1 and VP2. Approximately 95% are VP2 (major VP; 60 kDa) and 5% are VP1 (minor VP; 86 kDa) [65]. VP1 and VP2 are generated through alternative splicing and share the same C-terminal sequence, but VP1 contains 227 additional residues at its N-terminus, the so-called VP1 "unique region" (VP1u). The 60 protomers form 20 trimeric capsomers in the cytoplasm of the infected cell, which are assembled into an icosahedral capsid structure in the host nucleus. Due to the T = 1 symmetry, all protein subunits can be assembled in the same orientation relative to each other. This perfect symmetry enables an optimal thermodynamic sink for each protomer interaction, forming a very stable capsid around the ssDNA genome.
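
The stoichiometry stated above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted in the text (60 subunits, approximately 5% VP1, trimeric capsomers); the function name `capsid_composition` is illustrative, and the result is an expected average rather than a measured per-particle count.

```python
# Illustrative arithmetic only; the numbers are those quoted in the text.
SUBUNITS_PER_CAPSID = 60   # T = 1 icosahedral capsid
VP1_FRACTION = 0.05        # approximately 5% of subunits are VP1
TRIMER_SIZE = 3            # protomers per capsomer


def capsid_composition(subunits=SUBUNITS_PER_CAPSID, vp1_fraction=VP1_FRACTION):
    """Return the expected average (VP1, VP2) copy numbers per capsid."""
    vp1 = round(subunits * vp1_fraction)
    vp2 = subunits - vp1
    return vp1, vp2


vp1_copies, vp2_copies = capsid_composition()
trimers = SUBUNITS_PER_CAPSID // TRIMER_SIZE
print(f"VP1: {vp1_copies}, VP2: {vp2_copies}, trimeric capsomers: {trimers}")
```

Under these assumptions, a capsid carries on average three copies of VP1, which sits at the lower end of the 3 to 10 copies per virion reported for parvoviruses in general.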

Large-scale propagation of native B19V is not possible due to the lack of a fully permissive cell culture system. Accordingly, structural studies have been performed with recombinant B19V-like particles, which are similar, although not identical, to infectious native capsids. The structure of the VP2 recombinant particle has been determined to ~3.5 Å resolution [36]. Similar to other parvoviruses and many icosahedral viruses, the major capsid protein VP2 is structured as a "jelly roll" with a β-barrel motif. The loops connecting the strands of the β-barrel define the capsid surface topology that differentiates B19V from other parvoviruses. B19V lacks the prominent protrusions at the icosahedral threefold axes that are characteristic of other parvoviruses. The channel at the fivefold icosahedral axis is surrounded by a large canyon-like depression. Unlike in other parvoviruses, the channel in B19V VP2 capsids is constricted at its outside end. However, a cluster of glycine residues at this position may confer sufficient flexibility to open the channel upon specific cellular triggers during the infection. A striking difference between B19V and other parvoviruses is the external position of the N-VP2 and probably also VP1u [36,66]. Accordingly, the role of the fivefold channel in B19V would be limited to the externalization and packaging of the viral genome.

## **4. VP1u Is the Immunodominant Region of the Capsid but It Is Not Accessible in Native Virions**

Although VP1u may occupy a surface position in the B19V capsid, different regions of the protein were shown to be inaccessible to antibodies. However, exposure of native capsids to heat or low pH rendered these regions accessible without capsid disassembly. In contrast to native virions, VP1u is always accessible in recombinant B19V-like particles [44]. The inaccessibility of VP1u in native virions is not well understood and may be explained by a compact structural conformation, or by the presence of a masking structure hiding the essential protein domains from the immune system. Despite the non-accessible conformation of VP1u in native particles, this protein is the immunodominant part of the capsid and contains clusters of critical neutralizing epitopes [67,68] (Figure 1).

**Figure 1.** Schematic depiction of the neutralization profile and functional domains in the VP1u of B19V. The neutralizing profile revealed a cluster of important epitopes in the N-terminal region of the VP1 corresponding to functional domains. RBD; receptor-binding domain. PLA2; phospholipase A2 domain.

Typically, neutralizing antibodies prevent the viral infection by interfering with early steps of the viral life cycle, i.e., attachment to cellular receptors, uptake, fusion, or conformational changes required for entry [69,70]. Importantly, the inhibition by neutralizing antibodies should be distinguished from the opsonization of viruses by antibodies, which can hamper the viral infection by immobilization of the virus and subsequent degradation of the immune complex by the complement system, immune cells or also by the cytoplasmic TRIM21/proteasome mechanism [71]. However, the specific targeting of essential capsid protein domains by neutralizing antibodies is required to efficiently interfere with the viral infection. Upon B19 viremia, the humoral immune response first generates IgM antibodies, which predominantly target the major capsid protein VP2. With the class-switch and long-term immunity, an increasing percentage of B lymphocytes secrete neutralizing antibodies against the VP1u region [72]. In this regard, a deficient immune response to VP1u has been associated with persistent infections, emphasizing the important role of the immune response against VP1u in clearing the virus [73,74].

Immunization experiments with vaccine candidates based on virus-like particles (VLPs) demonstrated that VP1u is essential to raise a strong neutralizing response against B19V [75,76]. However, the neutralization mechanism of antibodies targeting VP1u has remained largely elusive. Being the immunodominant region of the capsid, the originally inaccessible VP1u should become exposed in the extracellular milieu, and not inside endosomes, as shown for other parvoviruses. In line with this assumption, it has been shown that a neutralizing antibody against the N-terminal part of the VP1u was unable to bind native cell-free virions but was able to block virus entry into susceptible cells. Moreover, capsids without VP1u were unable to internalize into susceptible cells, demonstrating the involvement of the VP1u in B19V uptake [32,54]. These findings explain the high neutralization potential of VP1u antibodies, which target exclusively capsids during the initial interaction with cell receptors and block virus uptake, and further emphasize the importance of VP1u as an essential component of prospective B19V vaccines.

## **5. Role of VP1u in the Restricted Tropism of B19V**

B19V has a remarkably narrow tropism. The virus shows productive infection exclusively in erythroid precursor cells at EPO-dependent intermediate erythroid differentiation stages, with increasing permissiveness from BFU-E to erythroblasts [53]. Viral tropism can be determined already at the cell surface by the expression of specific cell receptors required for virus entry and/or intracellularly by receptor-independent post-entry replication steps. The marked erythroid tropism of B19V is determined at multiple steps, i.e., the receptor-mediated uptake, genome replication, transcription, splicing and packaging [56,57,77]. A virus requiring such strict intracellular conditions for replication would also require a selective mechanism of cell entry to target exclusively the few cells where the virus can replicate. This strategy would allow the virus to avoid internalization into non-permissive cells, which would lead to abortive infections and inefficient viral propagation. Accordingly, it would be expected that B19V uses an erythroid-specific surface molecule as an entry receptor.

## *5.1. VP1u Contains a Receptor-Binding Domain That Is Essential for Virus Entry into Permissive Cells*

The neutral glycosphingolipid globoside (Gb4), also known as P antigen, has long been considered the primary receptor of B19V [78]. A large body of evidence suggests that B19V recognizes Gb4 and that the interaction is required for the infection [79–82]. However, the wide-range expression of Gb4 does not correlate well with B19V binding and uptake, nor can it explain the pathogenesis or the remarkably narrow tissue tropism of the virus [83]. By using a knockout cell line, we demonstrated that Gb4 does not have the expected function as the primary cell surface receptor required for B19V entry. Instead, Gb4 has an essential role at a post-entry step after virus uptake and before the delivery of the viral genome into the nucleus for replication [84]. Other receptor molecules, such as α5β1 integrin [85] and Ku80 autoantigen [86], have been proposed as potential coreceptors for B19V infection. However, the restricted uptake of B19V does not correspond with their wide expression profiles.

In an earlier study, we showed that the VP1u harbors a receptor-binding domain (RBD), which enables the uptake of the virus. Purified recombinant VP1u (recVP1u) was able to bind and to internalize exclusively into B19V permissive cells. Moreover, incorporation of VP1u subunits on bacteriophage VLPs by chemical coupling enabled their internalization into B19V permissive cells (Figure 2) [59]. The VP1u cognate receptor has not yet been identified, but its expression profile corresponds with the restricted tropism of B19V, being expressed exclusively in cells at erythropoietin-dependent erythroid differentiation stages [54,59].

## *5.2. Mapping and Structural Characterization of the Receptor-Binding Domain in the VP1u*

The receptor-binding domain (RBD) in the VP1u was identified by using recVP1u variants with increasing N- and C-terminal truncations. The VP1u variants internalized normally when they were truncated less than 5 AA at the N-terminus or less than 147 AA at the C-terminus. Longer truncations at both ends decreased or blocked VP1u uptake [28]. According to these results, the RBD spans the region between AA 5–80 of VP1u, which explains the detectable exposure of this domain on the surface of susceptible cells before uptake [32,87], as well as the presence of a cluster of neutralizing epitopes [67,68].
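
The boundary arithmetic behind this mapping can be made explicit. The sketch below is a simplification under stated assumptions: it takes the VP1u length of 227 residues and the two truncation limits quoted above, and it adopts the boundary convention that reproduces the AA 5–80 region given in the text. The function name `map_required_region` is hypothetical and does not correspond to any published tool.

```python
# Illustrative only: derives the approximate RBD boundaries from the
# truncation limits quoted in the text. The boundary convention is an
# assumption chosen to match the stated region (AA 5-80).
VP1U_LENGTH = 227          # residues in the B19V VP1u
N_TRUNCATION_LIMIT = 5     # uptake is normal below this N-terminal truncation
C_TRUNCATION_LIMIT = 147   # uptake is normal below this C-terminal truncation


def map_required_region(length, n_limit, c_limit):
    """Return (first, last) residue of the region needed for uptake."""
    first_required = n_limit
    last_required = length - c_limit
    return first_required, last_required


start, end = map_required_region(VP1U_LENGTH, N_TRUNCATION_LIMIT, C_TRUNCATION_LIMIT)
print(f"Approximate RBD: AA {start}-{end} ({end - start + 1} residues)")
```

The C-terminal limit of 147 residues out of 227 is what places the distal boundary of the domain near residue 80.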

**Figure 2.** Binding and internalization of recombinant MS2-VP1u labeled with Atto-488. (**A**) Schematic depiction of MS2-VP1u particles. (**B**) Confocal fluorescence microscopy of MS2-VP1u bound to UT7/Epo cells at 4 °C and internalized at 37 °C. MS2-Atto488-Δ126 (100 N-terminal AA of VP1u); MS2-Atto488 (without VP1u).

The secondary structure analysis of the N-terminus of VP1u (AA 1–80) from natural B19V isolates predicted a cluster of three α-helices with high confidence: helix 1 (AA 14–31), helix 2 (AA 35–45), and helix 3 (AA 59–68) (Figure 3A). However, only helix 1 was conserved among other erythroparvoviruses (Figure 3B) and displayed a prominent amphiphilic character. The marked segregation of polar and hydrophobic amino acids between the two opposite flanks of the α-helix is well suited for receptor binding. Compared with the residues of the hydrophilic side, the amino acids of the hydrophobic side were highly conserved (Figure 3C). Point mutations on the hydrophobic side blocked VP1u binding and internalization, suggesting a critical role of these residues in the interaction of VP1u with its cognate cellular receptor [28].
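
The amphiphilic character of a helix can be quantified with a helical-wheel projection. The following is a minimal sketch, not the analysis used in [28]: it assumes an ideal α-helix (100° of rotation per residue), a deliberately crude binary hydrophobicity score, and two hypothetical 18-residue peptides rather than the actual helix 1 sequence. A helix whose hydrophobic residues cluster on one face gives a large hydrophobic moment; a helix without such segregation gives a moment near zero.

```python
import cmath
import math

# Assumption: crude binary classification, not a published hydrophobicity scale.
HYDROPHOBIC = set("AILMFVWC")
DEGREES_PER_RESIDUE = 100.0  # ideal alpha-helix, 3.6 residues per turn


def wheel_angles(sequence):
    """Angular position (degrees) of each residue on the helical wheel."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(sequence))]


def hydrophobic_moment(sequence):
    """Mean vector sum of per-residue scores projected on the wheel."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = 0j
    for i, residue in enumerate(sequence):
        score = 1.0 if residue in HYDROPHOBIC else -1.0
        total += score * cmath.exp(1j * math.radians(i * DEGREES_PER_RESIDUE))
    return abs(total) / len(sequence)


# Hypothetical peptides: one with all hydrophobic residues on a single face,
# one made of a single residue type (no segregation at all).
segregated = "".join(
    "L" if math.cos(math.radians(angle)) > 0 else "K"
    for angle in wheel_angles("X" * 18)
)
uniform = "L" * 18

print(segregated, round(hydrophobic_moment(segregated), 2))
print(uniform, round(hydrophobic_moment(uniform), 2))
```

With a real sequence, a published scale would replace the binary score, but the geometric idea is the same as in the helical wheel of Figure 3C.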

The sequence analysis of the first 80 amino acids of VP1u predicted two additional helices (Figure 3A). Disruption of the tertiary conformation of these domains by the introduction of flexible sequences strongly impaired VP1u internalization. This observation suggests that the spatial configuration of the three helices is crucial for VP1u binding to its cognate receptor and subsequent uptake. An ab initio modeling of the RBD by the QUARK algorithm [88] predicted a helix-like spatial configuration of the three helices (Figure 4A,B), where a cluster of conserved and internalization-relevant amino acids was modeled in close proximity (Figure 4C,D) [28]. The spatial proximity of function-relevant residues may correspond to a critical receptor-interacting site.

## *5.3. VP1u Cognate Receptor Facilitates B19V Targeting and Uptake Exclusively into Permissive Cells*

B19V requires a strict intracellular environment for replication that can only be found in the erythroid progenitor cells (EPCs) in the bone marrow. The essential intracellular factors appear to be simultaneously upregulated in EPCs during the EPO-dependent differentiation stages. A study has shown that the internalization and the replication of B19V were considerably enhanced when CD34+ hematopoietic stem cells were stimulated with EPO [57]. Not surprisingly, the two cell lines that are most frequently used to study B19V infection, the megakaryocyte-erythroid UT7/Epo cells [89] and the erythroleukemic KU812Ep6 cells [90], are both derived from an EPO-dependent subclone. EPO signaling maintains the survival of cells that entered the intermediate erythroid differentiation stages [91–93]. Besides EPO signaling, B19V infection requires hypoxic conditions, which characterize the bone marrow microenvironment where the virus replicates. Hypoxia upregulates the signal transducer and activator of transcription 5 (STAT5) pathway, which facilitates viral DNA replication [61,94,95]. During the EPO-dependent differentiation stages, a cluster of erythroid-specific genes is upregulated [96], including the VP1u cognate receptor [59], which jointly are essential for B19V replication. In this regard, the main role of the VP1u receptor would be to facilitate the targeting and the uptake of B19V exclusively into cells providing a permissive intracellular environment for the infection. This strategy prevents the internalization into non-permissive cells, which would result in abortive infection.

**Figure 3.** Structural motifs within the VP1u RBD. (**A**) Amino acid sequence of the N-terminal of VP1u (AA 1–80) and the three predicted alpha helices (underlined red). (**B**) Conservation of alpha helix 1 among erythroparvoviruses. (**C**) Modeled helical wheel of the conserved helix 1 (AA 14–31) shows the spatial arrangement of hydrophobic and polar amino acids within helix 1. Amino acid differences found in B19V isolates are shown in a wider radius. Hydrophobic = orange; polar = green; basic = blue; acid = red. The helical wheel of the simian parvovirus helix 1 is shown.

**Figure 4.** Ab initio modeling of the RBD in the VP1u. (**A**) Front and side views. Helix 1 appears in blue (AA 14–31), helix 2 in yellow (AA 35–45), helix 3 in red (AA 57–68). (**B**) Helix distribution and sequence of the modeled AA 14–68. The amino acids required for VP1u internalization are colored in orange (hydrophobic) and green (polar). (**C**) The spatial distribution of essential amino acids is shown as spheres in the helical structure (**D**) and in the surface model of the RBD.

## *5.4. Evolutionary Aspects of B19V Restricted Tropism and the Origin of the RBD in the VP1u*

The origin of the marked tropism of B19V for erythroid precursors in the bone marrow is not known. The erythroparvovirus and dependoparvovirus genomes show striking similarities, both having identical hairpin telomeres at both ends and a related replication mechanism [97]. It is conceivable that the erythroparvovirus ancestor was dependent on helper virus co-infections. In line with this hypothesis, several studies observed the enhancement of B19V replication and gene expression in non-permissive cells in the presence of helper virus genes [98,99]. Besides the enhanced genome replication, adenovirus genes transactivated the B19V promoters, including the p44 promoter in the middle of the genome (nt 2247), which is normally silenced during B19V infection [100,101]. The p44 promoter is homologous to the promoters that initiate the expression of the structural capsid proteins in other parvoviruses. Interestingly, the expression of the structural proteins represents a limiting factor in B19V infection in non-erythroid cells. The transcription of the structural genes from the p44 promoter might have played an important role in the helper-dependent ancestors of erythroparvoviruses but could have been replaced during evolution by an alternative helper-independent replication in erythroid cells. In contrast to most other parvoviruses, B19V shows alternative splicing in the transcript from the p6 promoter that also enables the expression of the distal genes [102]. This exceptional splicing mechanism of B19V, which strikingly occurs only in EPCs, makes the internal and helper-dependent p44 promoter dispensable. Interestingly, there is another putative internal promoter (p55) at nt 2308 that might have similar properties.

According to this evolutionary model (Figure 5), the erythroparvovirus ancestor would have generally exhibited a helper virus-dependent replication in different tissues, and sporadically, a helper-independent replication in EPCs. However, without a specific targeting and internalization into the erythroid progenitor cells, the overall infection still depended on the helper virus co-infection. The erythroid-specific transcription of the structural genes from the p6 promoter generates a transcript with a longer 5′-UTR that possibly allows the displacement of the start codon of the capsid proteins and consequently, a longer VP1u region (Figure 6). The additional N-terminal amino acid sequence, expressed only during the helper-independent infection in erythroid cells, might have evolved to the RBD in the VP1u. The erythroid-specific targeting boosted the infection in the EPCs and thus represented a positive feedback loop that promoted the autonomous replication in the erythroid tissue. Vice versa, the helper-independent replication was the driving force for the positive feedback mechanism. The positive feedback enhanced the reliance on additional erythroid-specific factors and thus finally led to the extreme tropism of erythroparvoviruses.

**Figure 5.** Proposed origin of the marked erythroid tropism of B19V (see text for details).

**Figure 6.** Infection-relevant functional domains in the structural proteins of three representative parvoviruses. B19V exhibits a longer VP1u region compared to other parvoviruses. The additional N-terminal stretch of 80–90 amino acids contains the functional RBD [28] and a cluster of neutralizing epitopes [67,68]. NLS, nuclear localization signal.

The helper-dependent replication and expression of B19V genes in non-erythroid cells might still represent a significant aspect of the pathogenesis of B19V infection. The unspecific entry of the virus into non-erythroid tissues would not necessarily end in an abortive infection. The internalization of B19V during the late viremic phase by the antibody-dependent enhancement (ADE) provides a basis for latent infections in diverse tissues. These latent viruses might be sporadically reactivated by helper virus infections, which would explain many of the B19V-associated diseases as well as the recurrent detection of B19V DNA in the serum and different tissues [99,103].

## **6. Role of VP1u in the Subcellular Trafficking of Incoming B19V**

To infect the cell, parvoviruses follow a complex route from the plasma membrane to the nucleus where they replicate. Various domains in the VP1u of parvoviruses have been shown to play a critical role in the process by assisting the transport of the incoming capsids throughout the different membrane-enclosed organelles and the highly crowded cytosol and by promoting their translocation through the nuclear pore complex (NPC) into the nucleus (Figure 6).

## *6.1. The Phospholipase A2 (PLA2) Domain*

Following the interaction of the VP1u RBD with its cognate receptor, B19V is internalized by clathrin-mediated endocytosis and enters the endosomal pathway [104]. Endosomes provide cues that trigger capsid conformational rearrangements required for subsequent trafficking steps and contribute to the transport of incoming viruses to the nuclear vicinity. However, the mechanism followed by parvoviruses to escape from endosomal vesicles into the cytosol remains unclear. Phospholipase A2 enzymes (PLA2s) catalyze the hydrolysis of phospholipids and the release of lipid mediator precursors. Accordingly, PLA2s are key enzymes in many cellular processes such as lipid membrane metabolism, inflammation, membrane remodeling, host defense, and signal transduction [105]. PLA2s are found in mammalian tissues as well as in arachnids, insects, mollusks, reptiles, plants, and bacteria. A PLA2 domain containing the typical catalytic motif HDXXY and the calcium-binding site GXG is conserved in the VP1u of parvoviruses, including B19V (except for amdoparvoviruses) [21–23,25,106]. Mutations in either of these motifs impaired both the enzymatic activity and viral infectivity [20,21,23]. The pharmacological disruption of endosomal membranes or co-infection with endosomolytically active adenovirus, but not with inactive variants, partially rescued the infectivity of the PLA2 mutants, suggesting a role of the VP1u PLA2 in altering the endosomal membrane integrity to enable endosomal escape of viruses into the cytosol [24,26]. In B19V, VP1u mutations not related to the critical PLA2 motifs were also shown to reduce the enzymatic activity, probably by disrupting the three-dimensional arrangement surrounding the PLA2 domain [107]. Although the PLA2 may facilitate the endosomal escape of incoming B19V, the mechanism involved is poorly understood. Interestingly, B19V endosomal escape was shown to occur without detectable endosomal membrane permeabilization or damage [104], and the enzymatic requirements for the PLA2 activity, i.e., pH and calcium concentration, are not optimal in the endocytic compartment [106]. Accordingly, it remains unclear how the PLA2 activity of VP1u supports the escape of capsids from endosomes.
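
The two PLA2 signature motifs named above, HDXXY and GXG, are simple enough to locate in a protein sequence with regular expressions. The sketch below is illustrative only: `find_motifs` is a hypothetical helper, `toy_sequence` is an invented peptide rather than the B19V VP1u sequence, and a match to such short patterns does not by itself indicate a functional PLA2 domain.

```python
import re

# X in the motif notation means "any amino acid", written as "." in a regex.
# Lookaheads are used so that overlapping occurrences are also reported.
MOTIFS = {
    "catalytic HDXXY": re.compile(r"(?=(HD..Y))"),
    "calcium-binding GXG": re.compile(r"(?=(G.G))"),
}


def find_motifs(sequence):
    """Return {motif name: [(1-based position, matched residues), ...]}."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [
            (match.start() + 1, match.group(1))
            for match in pattern.finditer(sequence)
        ]
    return hits


# Invented peptide used only to demonstrate the scan.
toy_sequence = "MSKGPGLTHDEAYDK"
for motif, positions in find_motifs(toy_sequence).items():
    print(motif, positions)
```

Because GXG in particular occurs frequently by chance, a scan of this kind is best read as a first filter that is followed by alignment against known PLA2 domains.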

B19V PLA2 has been shown to up-regulate Ca<sup>2+</sup> entry [108], to inhibit Na<sup>+</sup>/K<sup>+</sup> ATPase activity and K<sup>+</sup> channels [109,110], and to up-regulate ENaC [111]. These activities may contribute to the pathophysiology of B19V infections. Moreover, due to the inflammatory-like effects exerted by recombinant VP1u in cultured fibroblasts [112] and in UT7/Epo cells [107], it has been hypothesized that the PLA2 domain of VP1u may contribute to B19V-associated syndromes, such as arthropathy and autoimmunity.

## *6.2. Nuclear Localization Signals (NLSs)*

Following endosomal escape, parvovirus capsids are imported into the nucleus. Nuclear import of most proteins involves classical nuclear localization signals (NLSs) consisting of a stretch of basic amino acids, which interact with importin-α/importin-β to mediate transport through the nuclear pore [113]. The size of the parvovirus capsid is below the diameter limit of the nuclear pore complex (NPC). Therefore, capsids can theoretically be translocated through the NPC intact or without major disassembly [114]. NLSs have been identified in the VP1u of several parvoviruses [12–19] and confer nuclear import potential to the incoming particles via interaction with importin-β [115]. The VP1u of B19V does not display a motif resembling a classical NLS; however, when expressed in eukaryotic cells, VP1u accumulates in the cytoplasm and in the nucleus [107]. The stretch of basic amino acids found in VP2 occupies an internal position in the capsid and has been implicated in the nuclear translocation of assembly intermediates [116]. Accordingly, the mechanism of nuclear import of B19V remains uncertain and might differ fundamentally from that of other parvoviruses.

## **7. Biotechnological Applications of the VP1u of B19V**

Nanocarriers are designed to efficiently deliver therapeutic molecules to specific tissues minimizing adverse effects [117]. Despite important progress, the drug delivery technology based on synthetic nanocarriers remains highly inefficient. One meta-analysis revealed that over 99% of the drugs do not reach the diseased cells and accumulate instead in non-target tissues or are cleared from the body [118]. Ideally, nanocarriers must specifically internalize into the target cells, escape from the endocytic compartment, and release their payload into the cytosol. These processes resemble the early infection steps of viruses, which operate as powerful natural nanocarriers to efficiently deliver genetic material into target cells by complex mechanisms shaped by evolution. The targeting machinery that is engaged in the early viral infection steps can be utilized to generate virus-inspired nanocarriers as efficient drug or gene delivery vehicles [119,120]. In this regard, the VP1u of B19V includes many interesting features that can potentially be exploited for drug delivery and diagnostics, i.e., specific cell targeting, efficient cell entry, and endosomal escape.

## *7.1. Specific Biomarker for EPO-Dependent Erythroid Differentiation Stages*

Diverse hematological conditions (e.g., leukemia, thalassemic and myelodysplastic syndromes, bone marrow metastases of solid tumors, septicemia, or severe health conditions after surgery) are typically associated with the presence of erythroblasts outside the bone marrow [121–126]. Accordingly, the screening of peripheral blood for nucleated red blood cells (NRBCs) is used to recognize hematological disorders or severe health conditions. Assays to detect NRBCs must be very sensitive because the presence of only a few NRBCs can indicate serious underlying disorders. Unfortunately, automated hematology analyzers may not detect low levels of NRBCs. Besides, they generate suspect flags, which should be examined manually [127]. The currently used automated detection of NRBCs in peripheral blood has a detection limit of 1–2 erythroblasts per 100 white blood cells [123,128]. In comparison, VP1u-decorated MS2 capsids were able to detect as few as one erythroleukemic UT7/Epo cell in 100,000 isolated white blood cells (unpublished observations). The sensitive identification of erythroblasts in the peripheral blood by fluorescent VP1u bioconjugates has the potential to improve the detection of diverse hematological disorders or severe health conditions and to facilitate an early diagnosis without the systematic need for an invasive technique such as bone marrow biopsy.
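
The difference between the two detection limits quoted above is easier to appreciate as a ratio. The following minimal sketch only restates the figures from the text as exact fractions; `fold_improvement` is an illustrative helper, and the VP1u-MS2 figure refers to unpublished observations with a cell line mixed into isolated white blood cells, so the ratio should not be read as a validated clinical sensitivity.

```python
from fractions import Fraction


def fold_improvement(reference_limit, new_limit):
    """Ratio of two detection limits expressed as target cells per white blood cell."""
    return reference_limit / new_limit


# Figures quoted in the text.
automated_low = Fraction(1, 100)     # 1 erythroblast per 100 white blood cells
automated_high = Fraction(2, 100)    # 2 erythroblasts per 100 white blood cells
vp1u_ms2 = Fraction(1, 100_000)      # 1 UT7/Epo cell per 100,000 white blood cells

low = fold_improvement(automated_low, vp1u_ms2)
high = fold_improvement(automated_high, vp1u_ms2)
print(f"Nominal gain in sensitivity: {low}- to {high}-fold")
```

On these numbers alone, the nominal gain is about three orders of magnitude.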

The precise identification and isolation of erythroid progenitor cells is important in hematological research and in diagnostics to characterize and treat bone marrow disorders. However, the technique remains rather complex and laborious, since the currently used markers are not lineage-specific (CD36, CD38, CD44, CD45, CD71, CD105, EPOR) or are broadly expressed during the erythroid development (glycophorin A). Therefore, the combination of several antibodies is necessary to achieve the correct identification [124,129–134]. In contrast, the fluorescent VP1u bioconjugate appeared as a unique and highly sensitive marker for the EPO-dependent erythroid differentiation stages and readily detected these cells in heterogeneous cell populations from different tissues [54]. The findings show the potential of the VP1u as a biomarker to identify and sort erythroid differentiation stages in a simpler procedure than has been practiced so far.

It is expected that the future biotechnological applications of the VP1u will be spurred by the identification of its cognate receptor. However, the identity of the VP1u receptor will not necessarily be determinant for the applicability of the VP1u as a specific cellular marker. Historically, it is not uncommon to use cell surface markers to identify cell populations based on empirical evidence without knowing the identity and/or the function of the targeted receptors.

## *7.2. Specific Drug Delivery and Chemotherapy*

## 7.2.1. β-Hemoglobin Disorders

β-hemoglobin disorders are a group of highly prevalent hereditary diseases caused by mutations in the gene encoding the β-chain of hemoglobin, resulting in qualitative and quantitative defects in β-globin production. β-thalassemias are a heterogeneous group of genetic disorders characterized by the partial or complete absence of β-globin chain production, leading to anemia and iron overload. The disease is highly prevalent, with 80–90 million carriers worldwide. Without diagnosis and appropriate treatment, the severe forms of β-thalassemia lead to death before age 20 [135]. Sickle cell disease (SCD) is the most common and severe hemoglobinopathy. In SCD, a single mutation in the β-globin gene results in the production of an aberrant hemoglobin molecule, which causes the rigid sickle-like shape of erythrocytes. Without treatment, SCD is lethal before age five [136].

Patients with severe β-hemoglobin disorders require regular blood transfusions, which lead to iron overload and related complications. Accordingly, iron chelation therapies are also required [137,138]. The most severe forms of the disease have been successfully treated by allogeneic hematopoietic stem cell transplantation from a matched related donor. However, major drawbacks are the difficulty of finding a histocompatible donor and the need for extensive immunosuppressive regimens, with the risk of immunological complications. Moreover, this approach is not accessible to many affected individuals [139,140]. Gene therapy and gene editing strategies to restore the globin genes have generated promising results. However, these approaches lack cell-specific vectors, resulting in poor efficiency and the risk of insertional oncogenesis [141–143].

Given the numerous drawbacks of the current therapeutic strategies, there is great interest in developing novel therapeutic options. The therapeutic targeting of RNA by double-stranded RNA-mediated interference (RNAi) or by antisense oligonucleotides (ASOs) allows specific inhibition of the target of interest and rapid translation to the clinic [144]. However, the delivery of nucleic acid molecules to the bone marrow remains highly inefficient. The MS2 capsid is a well-studied vector for drug delivery and can be easily loaded with therapeutic ASOs or small interfering RNAs (siRNAs) [145,146]. This strategy protects the therapeutic nucleic acid molecules in the extracellular milieu, avoids solubility problems, and thus allows more options for modifying the oligonucleotides. In a previous study, we showed that anchoring VP1u subunits to the surface of MS2 capsids retargets the particles to erythroid cells. This finding offers the opportunity to deliver encapsidated genetic material specifically to this cell population [59]. Potential targets of therapeutic ASOs or siRNAs include factors involved in the regulation of erythropoiesis, such as transferrin receptor 2, or regulatory elements of fetal hemoglobin, such as B-cell lymphoma/leukemia 11A and erythroid Kruppel-like factor. Specific downregulation of such factors in erythroid progenitor cells would significantly alleviate symptoms of β-hemoglobin disorders [147–150].

#### 7.2.2. Erythroleukemia

Acute erythroleukemia is a rare disorder associated with a poor prognosis. One study reported a median overall survival of 8 months [151]. The treatment of erythroleukemia is compromised by the systemic distribution of the malignant cells and their resistance to chemotherapeutics [152,153]. Therefore, the successful elimination of erythroleukemic cells by a cytotoxin requires a "magic bullet" strategy, that is, an efficient and specific targeting of the toxin to cancer cells that minimizes adverse effects on the surrounding healthy cells [154]. In erythroleukemias, the proliferating cancer cells are in the early and intermediate erythroid differentiation stages [155], which are the target cells of the VP1u. Accordingly, VP1u-mediated toxin delivery represents a possible strategy to overcome the resistance of erythroleukemia to chemotherapeutics. In previous studies, VP1u successfully targeted a toxin specifically to malignant erythroid precursors and thus selectively eliminated these cells from a mixed cell culture [156].

The immunity of many individuals against B19V would represent a serious obstacle to the application of VP1u-targeted delivery. About half of the human population is seropositive for anti-B19V antibodies. Similar problems are faced in the application of AAV vectors for gene therapy, where many individuals have antibodies against serotypes 2 and 3 [157]. Following the natural mechanism by which viruses evade the immune system, AAV researchers are searching for AAV isolates and isotypes that are not neutralized by the common pool of antibodies but still offer the beneficial properties of the original virus [158–160]. In the case of a short protein with a single function, as with the RBD of the VP1u, an immune escape by antigenic drift is easier to achieve without disturbing the receptor binding and internalization capacity. The natural mutations observed in various B19V isolates (Figure 3C), together with the mutational studies already performed [28], provide an excellent basis to mimic an antigenic drift of the VP1u RBD without decreasing the targeting function of the protein. Furthermore, there are different options to reduce the antigenicity of a therapeutic protein, such as fusion with an abundant endogenous protein, for example serum albumin or the immunoglobulin constant fragments [161–163]. Coupling to these endogenous proteins not only circumvents the immune response but also considerably increases the solubility, stability, and serum half-life of the therapeutic proteins. In line with this concept, bovine serum albumin (BSA) was used as an adaptor molecule for the attachment of the toxins to a VP1u-NeutrAvidin complex. The results showed that the modified BSA remained soluble after the attachment of 20–30 fluorescein or toxin molecules to the protein and was targeted exclusively to cells expressing the VP1u receptor. The stability of the drug attachment might be increased by packing the effector molecule into a capsid, as shown with the MS2 bacteriophage in previous studies [145,146,164]. The specific delivery of an encapsidated effector allows a higher dose per delivered particle without increasing toxicity. In addition, the capsid can be engineered to incorporate multiple residues to improve the targeting efficiency.

## **8. Concluding Remarks**

The VP1u is a key component of the capsid of human parvovirus B19 with essential functions in multiple steps of the infection, such as tissue tropism, uptake, intracellular trafficking, and entry. The VP1u is also the immunodominant region of the capsid and a crucial component for prospective vaccines. Future efforts will focus on better understanding the essential functions of VP1u in B19V infection and on identifying the VP1u interactome, notably its cognate cell receptor.

Recent innovations in protein engineering and nanomaterials science have the potential to revolutionize conventional methods of diagnosis and treatment, bringing new hope to patients. However, to date, poor selective targeting remains a major barrier to their clinical application. Only a few clinically approved nanoscale delivery vehicles integrate molecules that selectively target the cargo to the tissue of interest. In this regard, the remarkable erythroid specificity of the VP1u offers novel opportunities to generate virus-inspired biomarkers and nanocarriers that specifically target erythroid cells. This approach may contribute to a better understanding of the mechanisms governing erythroid development and to the treatment of disorders of the erythroid lineage. Efforts to circumvent the immune response against the VP1u and to optimize the stability and density of cargo delivery will facilitate its translation to human diagnostics and therapies.

**Author Contributions:** Conceptualization, R.L. and C.R.; writing—original draft preparation, C.R.; review and editing, R.L., J.B. and C.R.; Supervision, funding acquisition, C.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Swiss National Science Foundation (grant number 31003A\_179384) to C.R.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
