**1. Introduction**

Hepatitis C (HCV) infection remains a major public health problem, even if in the last few years HCV therapy has been improved by the availability of direct-acting antiviral (DAA) agents [1]. Phylogenetic analyses have identified eight HCV major genotypes, further subdivided into 67 subtypes [2]. HCV1b is widespread all over the world, HCV2 showed higher prevalence in Russia and in Italy. In Europe, the most common HCV2

**Citation:** Marascio, N.; Costantino, A.; Taffon, S.; Lo Presti, A.; Equestre, M.; Bruni, R.; Pisani, G.; Barreca, G.S.; Quirino, A.; Trecarichi, E.M.; et al. Phylogenetic and Molecular Analyses of More Prevalent HCV1b Subtype in the Calabria Region, Southern Italy. *J. Clin. Med.* **2021**, *10*, 1655. https:// doi.org/10.3390/jcm10081655

Academic Editor: Yasuhito Tanaka

Received: 2 March 2021 Accepted: 8 April 2021 Published: 13 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

subtypes are HCV2a/2c. HCV1a and HCV3a predominate in Europe and North America, while HCV4 is endemic in the Middle East, Central Africa and Mediterranean countries. HCV5 is endemic in South Africa, HCV6 in South East Asia and HCV7 was found in the Democratic Republic of Congo. Recently, HCV8 was found in Indian patients living in Canada [2–4]. Magiorkinis and colleagues reported a massive expansion of HCV1b infections between 1940 and 1980, sustained by blood transfusion and iatrogenic procedures [5]. In Europe, HCV1b was predominantly found in females and associated with births not later than 1958 [4]. Its prevalence is decreasing due to improved health standards [6]. Interestingly, HCV1b was predominant in Japan, Italy, and Spain with a high prevalence in patients with hepatocellular carcinoma [7]. Since 1997, HCV1b has been the most prevalent subtype in the Calabria Region reflecting national data [3,8].

The major prevalence worldwide and the low susceptibility to Interferon (IFN) or pegylated-IFN alfa with ribavirin (pegIFN-α/RBV) therapies made HCV1b the first target for the development of new antiviral drugs [9]. Currently, direct-acting-antiviral (DAA) pan-genotypic therapy can be used to treat infected people without the need for determining the genotype/subtype or performing a resistance test [1]. Pretreatment assessment should consider the presence of cirrhosis and comorbidity in view of post-therapy followup. However, after considering the data of DAA efficacy in a clinical setting, combination therapy still appears to be influenced by resistance-associated substitutions (RASs) carried in target regions in naïve or experienced patients [10].

In this study, we investigated transmission clusters in two cohorts of HCV1b positive subjects, enrolled in different time spans, to assess the dynamics of infection in the Calabria Region, southern Italy. In particular, more recent isolates were evaluated for the presence of mutations with a potential impact on treatment response.

#### **2. Materials and Methods**

#### *2.1. Study Population*

The study was approved by the Ethical Committee (#100; 27 April 2017) of the "Mater Domini" University Teaching Hospital of Catanzaro, Italy and it was included in the SINERGIE study [11]. The Ethical Committee approved the criteria that there is no need for informed consent for a non-interventional study. Forty-one serum samples, collected between 1 January 2015 and 31 December 2016, from patients infected by HCV subtype 1b were included in the analysis. Enrolled patients, attending the University Hospital of Catanzaro, were randomly selected from a list through a systematic 1:7 sampling procedure. The selected sample is representative of the whole HCV1b cohort, including 41.7% of males versus 54.0% of females with an overall median age 68 (31–84) years [12]. Patients were naïve to all treatments (25/41) or treated with IFN (3/41) and pegIFN-α/RBV (13/41). Additionally, only viral isolates from HCV1b positive subjects, collected between May and October 2010 during a previous epidemiological study in Calabria, were included in order to compare and investigate phylogenetic relationships with those from Catanzaro. All participants were resident in a small village, Sersale (Catanzaro province) [13]. The patients' clinical data was treated in accordance with the Helsinki Declaration (59th World Medical Association General Assembly, Seoul, Korea, October 2008) and the principles of good clinical practice.

#### *2.2. Diagnostic Procedures*

HCV RNA viral load was determined using the Cobas AmpliPrep/Cobas TaqMan HCV quantitative test v2.0 (Roche Diagnostics, Milan, Italy). Genotyping was performed by the Versant HCV genotype v2.0 assay (LiPA) (Siemens, Healthcare Diagnostic Inc., Tarrytown, NY, USA). Fibrosis stage was estimated by transient elastometry (FibroScan, Echosens, Paris, France), interpreted as follows: F0–F1 = minimal fibrosis (KPa ≤ 7.1), F2 = moderate fibrosis (7.1 < KPa ≤ 9.5), F3 = severe fibrosis (9.5 < KPa ≤ 14.5), F4 = cirrhosis (KPa > 14.5) [14].

#### *2.3. Amplification and Sequencing of HCV NS5B and NS5A Regions*

Viral RNA was extracted from 140 μL serum samples using the QIAamp Viral RNA Extraction Kit (Qiagen, Hilden, Germany) in accordance with the manufacturer's instructions. Healthy donor serum samples were used as a negative control. The RNA was reverse transcribed using the High-Capacity cDNA Reverse Transcription Kit protocol (Thermo Fischer Scientific, Waltham, MA, USA) and cDNA amplified by nested PCR using the FastStart High Fidelity PCR system (Roche Diagnostics, Basel, Switzerland). The specific primers used to amplify the NS5B (nt 8256–8632) and NS5A (nt 6086–6722) regions of HCV genome for the first and second rounds have been previously described [15,16]. The products were purified using the High Pure PCR Cleanup Micro Kit (Roche Diagnostics, Basel, Switzerland) and analyzed on 2% agarose gel stained with GelRed (Biotium Corporete Headquarters, Biotium Inc., Fremont, CA, USA). Both strands were sequenced using the Genome Lab DTCS Quick Start KiT (Beckman Coulter, Inc., Fullerton, CA, USA). Sequencing reactions were run on an automated DNA sequencer (Beckman Coulter, Inc., Fullerton, CA, USA). HCV sequences were aligned by MAFFT under the Galaxy platform (https://usegalaxy.org/, accessed on 27 March 2020) and manually edited by using Bioedit [17–19].

#### *2.4. Subtyping Tool Analysis*

NS5B and NS5A sequences were analyzed using the Oxford HCV Automated Subtyping Tool v.2.0 (http://dbpartners.stanford.edu/RegaSubtyping/html/subtypinghcvSUB. html, accessed on 20 April 2020) and COMET HCV typing tool (https://comet.lih.lu/index. php?cat=hcv, accessed on 20 April 2020) followed by phylogenetic analysis (see below) to confirm the initial subtyping assignment by LiPA assay [20,21].

#### *2.5. Datasets Construction*

Two datasets were built. The first dataset contained 78 total sequences: 53 HCV NS5B new sequences from Italy (41 from Catanzaro University Hospital and 12 from Sersale) plus 25 HCV NS5B subtype specific reference sequences downloaded from the HCV Los Alamos sequence database (http://hcv.lanl.gov/content/index, accessed on 11 May 2020). The second dataset comprised 162 total sequences including: 53 HCV NS5B sequences from Italy, previously classified as 1b subtype, plus 109 foreign HCV 1b NS5B sequences downloaded from the HCV Los Alamos sequence database (http: //hcv.lanl.gov/content/index, accessed on 11 May 2020).

#### *2.6. Likelihood Mapping*

The phylogenetic signal of each sequence dataset was investigated by means of the likelihood mapping analysis of 10,000 random quartets generated using TreePuzzle [22]. Groups of four randomly chosen sequences (quartets) were evaluated. For each quartet, the three possible unrooted trees were reconstructed using the maximum likelihood approach under the selected substitution model. Posterior probabilities of each tree were then plotted on a triangular surface so that fully resolved trees fell into the corners and the unresolved quartets in the center of the triangle (star-like trees). When using this strategy, if more than 30% of the dots fall into the center of the triangle, the data is considered unreliable for the purposes of phylogenetic inference.

#### *2.7. Phylogenetic Analysis*

Sequences of all datasets were aligned using MAFFT under the Galaxy platform and manually edited using Bioedit [17–19]. The subtypes of the newly generated sequences from Calabria were determined and confirmed by phylogenetic analysis of the first dataset. The maximum likelihood phylogenetic trees of the first and second dataset together with the estimation of the best-fit substitution models (TPM2 + F + I + G4 and TVMe + I + G4 for the first and second dataset, respectively) were performed through IQ-TREE with the Model Finder option and visualized with FigTree v. 1.4.4 [23]. Statistical support for

internal branches of the maximum likelihood (ML) trees were evaluated by bootstrap analysis (1000 replicates) and fast likelihood-based sh-like probability (SH-aLRT).

#### *2.8. Genetic Variability Analysis*

HCV1b viral population for each patient was screened for genetic variation with a cut-off of 15% [1]. Forty-one NS5B and twenty-two NS5A sequences at specific nucleotide positions were analyzed. Non-synonymous and resistance-associated substitutions (RAS) were determined using the Geno2pheno (HCV) 0.92 tool (last updated: June 2019) and aligning generated sequences to HCV1b (AJ238799) reference by MAFFT [17,24]. Resistance prediction rules available in the online tool were implemented by literature search [25].

#### *2.9. Public Availability of the Sequencing Data*

The 41 NS5B and 22 NS5A newly generated sequences were submitted to the GenBank database [26]. All sequences can be retrieved from GenBank under accession numbers: MW357752-MW357814.

## **3. Results**

#### *3.1. Patient Demographic Characteristics and Risk Factors*

The median age of the 53 patients was 70 years (range 31–90), with 58.5% females. Overall, dental treatment and surgical intervention were the first (16.9%) and second (13.2%) most frequent risk factors, followed by blood transfusion (3.8%) and cohabitation (1.9%). Only one patient reported intravenous drug use as a risk factor. Three patients declared no risk factors. Qualitative characteristics of the two cohorts are reported separately (Table 1).


**Table 1.** Patient demographic characteristics.

\* Multiple risk factors were: surgical intervention + blood transfusion (*n* = 4), surgical intervention + blood transfusion + cohabitation (*n* = 2), blood transfusion + cohabitation (*n* = 1), blood transfusion + dental treatment (*n* = 2), dental treatment + cohabitation (*n* = 1), surgical intervention + cohabitation (*n* = 2), surgical intervention + dental treatment (*n* = 8), surgical intervention + drug user (*n* = 1), surgical intervention + dental treatment + cohabitation (*n* = 1), surgical intervention + glass syringes (*n* = 7), surgical intervention + cohabitation + blood transfusion + glass syringes (*n* = 1), cohabitation + glass syringes (*n* = 1). Characteristics heading and total number of patients were in bold.

#### *3.2. Likelihood Mapping*

The phylogenetic noise of each dataset was investigated by means of likelihood mapping (Figure S1). The percentage of dots falling in the central area of the triangles was 13.2% and 7.5% for the first and second datasets, respectively. As none of the datasets showed more than 30% of noise, all of them contained sufficient phylogenetic signal.

#### *3.3. Phylogenetic Analysis*

All new sequences were classified as subtype 1b by both Oxford and COMET subtyping tools, and by phylogenetic analysis. The ML phylogenetic tree of the first dataset showed that all the 53 sequences collected from the Calabria Region were in the same statistically supported clade, closely related to the references subtype 1b and were therefore classified as subtype 1b (Figure 1).

**Figure 1.** Maximum likelihood phylogenetic tree of the first HCV NS5B dataset. The tree was rooted by using the midpoint rooting method. Branch lengths were estimated with the best fitting nucleotide substitution model according to a hierarchical likelihood ratio test, and were drawn to scale with the bar at the bottom indicating 0.2 nucleotide substitutions per site. The values along a branch represent significant statistical support for the clade subtending that branch (bootstrap support >75%). The Italian (Calabria Region) sequences are highlighted in red. Clades 1b, 1c and 1a are also highlighted with brackets.

The maximum likelihood phylogenetic tree of the second dataset showed the presence of a supported cluster and a main statistically supported clade (Figure 2).

**Figure 2.** Maximum likelihood phylogenetic tree of the second dataset HCV1b dataset. The tree was rooted by using the midpoint rooting method. Branch lengths were estimated with the best fitting nucleotide substitution model according to a hierarchical likelihood ratio test, and were drawn to scale with the bar at the bottom indicating 0.07 nucleotide substitutions per site. The asterisk (\*) along a branch represents significant statistical support for the clade subtending that branch (bootstrap support >75%). The main statistically significant sequences are highlighted with brackets.

The supported cluster included three foreign (Morocco and US) related sequences. Within the main clade, the HCV1b Italian (Calabria Region) sequences were distributed in 16 statistically supported clusters. Twelve clusters (12/16, 75%), presented foreign HCV 1b reference sequences intermixed with sequences from Italy (Clusters: 2, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16). Four statistically significant monophyletic clusters, including only sequences from Calabria were also observed (clusters: 1, 3, 4, 12).

Cluster 1 included three sequences from Catanzaro (ISS 9, 29, 43) reporting the following risk factors: blood transfusion/cohabitation, surgery/dental treatment and dental treatment, respectively.

The sequence ISS38 was located in cluster 2 with one reference from Switzerland and three from Thailand. Cluster 3 was composed of two Calabrian sequences (ISS 12 and 32) characterized respectively, by the following risk factors: dental treatment and blood transfusion/dental treatment. Interestingly, cluster 4 included seven sequences (ISS 695; 1791, 1795, 1805, 1836, 1821, 1003), closely related to each other, all from Sersale village. The following risk factors were reported: surgery, cohabitation with HCV positive, sharing glass syringes and blood transfusion (ISS 695), surgery and sharing glass syringes (ISS 1003, 1805, 1821, 1836), no risk factors (ISS 1791, 1795). Cluster 5 included one isolate from Calabria (Catanzaro) reporting blood transfusion as a risk factor and related to a sequence from Tunisia. Cluster 6 was composed of three isolates (ISS10, 30 and 15) from Catanzaro (risk factors: blood transfusion, surgery/dental treatment and surgery) related to a sequence from Morocco. Cluster 7 included one isolate (ISS 41) from Catanzaro characterized by the following risk factor: surgery, related to one reference from Cyprus and one from Greece. Cluster 8 included isolate ISS49 (risk factors: surgery and blood transfusion), one reference from Cyprus and one from Uruguay. Cluster 9 included two isolates from Catanzaro, ISS6 and ISS25 (risk factors: surgery/multiple blood transfusion and surgery/dental treatment, respectively) related to a sequence from Cyprus and another from Portugal. Cluster 10 included eight isolates from Calabria (ISS 7, 21, 46, 16, 24, 1308, 164, 1741), three of which from Sersale, related to one reference from Argentina and three from Russia. Cluster 11 was characterized by three sequences from Catanzaro (ISS 13, 35 and 18) reporting the following risk factors (surgery, surgery/dental treatment, dental treatment, respectively) and related to a reference from Turkey. Cluster 12 included two isolates collected from Catanzaro (ISS 20 and 23) with risk factors: surgery/blood transfusion and surgery/dental treatment, respectively. Cluster 13 was composed of two sequences, the isolate ISS 44 from Catanzaro (reporting surgery/cohabitation as risk factors) related to a reference from Switzerland. Cluster 14 included three isolates from Catanzaro (ISS 11, 31 and 26) reporting the following risk factors (dental treatment, surgery, blood transfusion, and cohabitation) related to a reference from Japan. Cluster 15 included three sequences (ISS 3, 4 and 27) characterized by the following risk factors: surgery/dental treatment; surgery/blood transfusion; surgery/dental treatment/cohabitation and related to a reference from Nepal. Cluster 16 included seven isolates from Catanzaro (ISS 14, 36, 40, 1, 2, 42, 17) intermixed with many sequences sampled from different countries: USA, Greece, Austria, Argentina, Uruguay, France, Philippines, Switzerland, Thailand, Japan, China and Ireland.

#### *3.4. Substitutions on Target Regions in Patients Naïve to DAA*

The total (100%) of NS5B amplicons were sequenced. Nine (40%) NS5A amplicons were not successfully sequenced, while 10 sequences were not of suitable length for RAS screening. Available NS5A and NS5B sequences at the time of genotyping were screened for RASs and nonsynonymous substitutions. We identified the L159F NS5B substitution, conferring resistance to sofosbuvir (SOF), in 15/41 (ISS 6, 7, 13, 16, 18, 20, 21, 23, 24, 25, 28, 35, 44, 46, 50) isolates, among them 8/15 (ISS 6, 13, 21, 24, 25, 28, 35, 46) also carried the C316N NS5B related to dasabuvir resistance. In particular, frequency of detected NS5B RASs was 36.6%, while frequency of RASs carried on NS5A region was 22.7%.

The Y93H, associated with resistance to daclatasvir, elbasvir, ledipasvir (LPV), ombitasvir (OMV) and pibrentasvir was detected on NS5A in 3/22 (ISS 16, 24, 30) isolates. The

L31M substitution associated with resistance to all drugs mentioned above plus velpatasvir was found in 1/22 (ISS 21) isolate. Interestingly, all three isolates carried Y93H plus K108R substitution. The A92T NS5A OMV and LPV associated resistance was detected in 1/22 (ISS 2) isolate. Among patients who reported RASs in the viral population, seven have been previously treated with an IFN regimen with or without RBV and were classified as non-responders (4/7) or relapsers (3/7) with liver stiffness F3 or F4. On the other hand, the 33.3% of patients without RASs were IFN experienced with or without RBV. The median baseline RNA viral load was 2,280,000 IU/mL.
