**2. Hepatitis B Virus, Classification, and Gene Products**

HBV is a hepatocyte-tropic virus and is assigned to the family of hepatitis DNA viruses, *Hepadnaviridae* [11–13]. HBV is divided into 10 main genotypes, A–J, which differ by more than 8% at the nucleotide level [14,15]. The HBV genome is approximately 3.2 kilobases (kb) in size and consists of relaxed circular, partially double-stranded DNA (rcDNA), which is delivered to the nucleus of the host cell and converted into a covalently closed circular DNA (cccDNA) molecule [12]. The cccDNA is a stable, non-integrated episome and forms the template for all viral RNA transcripts. Because the genome lacks the origin of replication required for DNA-dependent DNA amplification, one of the viral transcripts, the pre-genomic RNA (pgRNA), serves as the template for replication to generate rcDNA via reverse transcription [12]. HBV contains four open reading frames (C, P, S, and X) and encodes seven proteins (polymerase, X protein, HBcAg, HBeAg, HBsAgL, HBsAgM, and HBsAgS). The polymerase is essential for several steps in the replication pathway through its reverse transcriptase, RNaseH, and priming activities. The X protein supports efficient infection and replication in vivo [11,12,16]. The core protein (HBcAg) constitutes the subunit of the viral capsid and is essential for the formation of virions. The e-antigen (HBeAg) is derived from the pre-core protein by proteolytic processing and is not part of the viral capsid. It is involved in modulating the host immune response against HBV and represents an important serological marker [11–13]. The virus encodes three related surface (envelope) proteins (HBsAg) that share a common S-domain. They are translated from different in-frame start codons and hence are distinguished by their N-terminal extensions.
The small HBsAg (HBsAgS) comprises only the S-domain with a size of 226 amino acids (aa), the middle HBsAg protein (HBsAgM) has an N-terminal extension of 55 aa (preS2 domain), and the large HBsAg (HBsAgL) has an additional extension of 108 or 119 aa (preS1 domain) depending on the genotype [17] (Figure 1A,B). In addition to the classification by genotypes, HBV is classified into four main serotypes based on antibody reactivity against HBsAg. All genotypes have a common serotypic reactivity against a major antigenic site called the "a"-determinant, but further express one determinant from each of two mutually exclusive allelic pairs, "d" or "y" and "w" or "r" [18–20]. The antigenic determinants of HBsAg are located in an exposed loop region of the S-domain. HBsAg and antibodies against HBsAg (anti-HBs) are important serological markers. The loss of HBsAg and seroconversion to anti-HBs antibodies indicate immunity and recovery from acute or chronic hepatitis B [13].
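
The domain sizes and serotype determinants described above can be combined in a short worked example. The following Python sketch is purely illustrative: it assumes that the quoted domain lengths are additive (each longer protein contains the shorter one in full, as expected for in-frame start codons) and that the four main serotypes correspond to the four combinations of the "d"/"y" and "w"/"r" determinants with the common "a"-determinant. The function names and the "short"/"long" labels for the two preS1 variants are hypothetical and do not come from the cited literature.

```python
from itertools import product

# Domain lengths in amino acids (aa), taken from the text above.
S_DOMAIN = 226
PRES2 = 55
# preS1 length depends on the genotype; the labels are hypothetical.
PRES1_BY_VARIANT = {"short": 108, "long": 119}


def hbsag_lengths(pres1_len):
    """Return the length (aa) of each surface protein for a given preS1 length.

    Assumes the domain lengths simply add up along the shared reading frame.
    """
    small = S_DOMAIN            # S-domain only
    middle = PRES2 + small      # preS2 + S
    large = pres1_len + middle  # preS1 + preS2 + S
    return {"HBsAgS": small, "HBsAgM": middle, "HBsAgL": large}


def main_serotypes():
    """Combine the common "a"-determinant with one allele from each pair."""
    return ["a" + first + second for first, second in product("dy", "wr")]


if __name__ == "__main__":
    for label, pres1_len in PRES1_BY_VARIANT.items():
        print(label, hbsag_lengths(pres1_len))
    print(main_serotypes())
```

Under these assumptions, HBsAgM is 281 aa long and HBsAgL is 389 or 400 aa long, and the four determinant combinations are adw, adr, ayw, and ayr.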

Characteristic of an HBV infection is the generation of a large quantity of HBsAg subviral particles (SVPs) and filaments devoid of capsid and of the viral genome. SVPs outnumber infectious virions in host sera by a factor of between 10² and 10⁵ [17,21–24]. SVPs are predominantly composed of HBsAgS, and their presence in the sera does not seem to interfere with HBV particle entry into hepatocytes, suggesting that SVPs act as decoys by binding virus-neutralizing antibodies [25]. HBsAgS SVPs share important immunological determinants with the mature virus, and therefore, SVPs derived from patient serum or recombinant SVPs represent effective immunogens for the induction of a protective immune response [26–28]. Vaccinated individuals develop antibodies targeting the "a"-determinant region, which provide protection against infection by all HBV serotypes [20,29]. The discovery and characterization of "empty" genome-free virions containing HBsAg and capsid is reviewed by Hu and Liu (2017) [30].

**Figure 1.** The surface (envelope) proteins (HBsAg) of hepatitis B virus (HBV): (**A**) The open reading frame encoding the complete hepatitis B surface antigen is depicted. The domain organization of preS1, preS2, and S is shown, and the number of amino acids (aa) in each domain is specified. The four transmembrane regions (TM1–4) are indicated by the thick black lines. The function of the different domains in relation to their orientation towards the lumen of the endoplasmic reticulum (ER) or cytosol is indicated. (**B**) The individual HBsAg open reading frames for the small (HBsAgS), middle (HBsAgM), and large (HBsAgL) proteins, and their post-translational modifications are shown. The size of the HBsAgS protein is indicated by the number of amino acids. Arrows represent the utilized glycosylation sites. Red arrows mark asparagine 146 (N146) in the S-domain. Orange and purple arrows mark asparagine 4 (N4) and threonine 37 (T37), respectively, in the preS2 domain. The glycine residue at position 2 (G2) of the preS1 domain, indicated by a purple line, is myristoylated. The observed molecular weights (MW) of the glycosylated (gp) and non-glycosylated (p) proteins separated by SDS-PAGE under reducing conditions are indicated on the right. (**C**) Design of the RTS,S vaccine, which is produced by co-expressing HBsAgS (aa 1–226) and a chimeric protein in which the circumsporozoite polypeptide (aa 210–398) is fused to the N-terminus of HBsAgS, including 4 aa from the preS2 domain.
