#### **1. Introduction**

Human papillomavirus (HPV) remains the most prevalent sexually transmitted viral infection in both men and women. Clinically, it is characterized by a wide spectrum of manifestations, ranging from premalignant lesions that regress spontaneously to malignant lesions that progress to cervical cancer (CC) [1].

Worldwide, although cervical screening programs have contributed to a decrease in incidence, CC remains the second most common cancer among women, with an estimated 266,000 deaths per year [2].

Nevertheless, combining HPV DNA testing with conventional cytology has been shown to greatly improve the detection of pre-cancerous states [3,4]. A wide variety of laboratory diagnostic methods for detecting HPV DNA, with different degrees of sensitivity and specificity, have now been developed [5–8].

Until a few years ago, PCR followed by nucleic acid hybridization was used to detect HPV genotypes. Compared with conventional cytology, these techniques provided more detailed information on HPV genotypes [9].

Subsequently, real-time PCR techniques were introduced for HPV diagnosis. They have significantly reduced both hands-on time and contamination rates.

This study covers three years of routine diagnostic data on HPV DNA and aims to retrospectively evaluate the impact of a switch from a nested PCR to a real-time PCR on HPV prevalence and HPV genotype distribution in genital samples collected from patients in the Apulia region. To address this issue, a quasi-experimental approach based on multiple logistic regression was used to evaluate the effect of the real-time PCR on overall HPV prevalence and on the prevalence of the individual viral genotypes.

#### **2. Materials and Methods**

#### *2.1. Clinical HPV Isolates and Patient Population Characteristics*

From January 2012 to December 2014, 1742 consecutive samples were collected, comprising 1605 cervico-vaginal swabs from 1328 females and 137 urethral swabs from 105 males. Some patients contributed multiple samples because they were retested at different times.

Specimens were transferred to the laboratory of Molecular Biology, U.O.C. Microbiology and Virology, Azienda Ospedaliera-Universitaria, Policlinico of Bari, where they were analyzed.

Sample information (date of sampling, ward, type of specimen, testing results), together with the age and sex of the patients for whom molecular testing was performed, was recorded in an anonymized database in which sensitive data were replaced with alphanumeric codes. No clinical data associated with these specimens were available.

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. For this type of study, formal consent is not required. This study was approved by the Ethics Committee (No. 5481, 13 December 2017) of the Azienda Ospedaliero-Universitaria "Consorziale Policlinico", Bari.

#### *2.2. Treatment of Samples*

A total of 2 mL of phosphate-buffered saline (pH 7.4) (Sigma-Aldrich, Milano, Italy) was added to cervico-vaginal swabs, collected with a rigid cotton-tipped swab applicator (Nuova Aptaca, Cannelli, Italy), and vortexed. For urethral swabs, 1 mL of phosphate-buffered saline (Sigma, Milano, Italy) was added before vortexing. All samples were then transferred to microcentrifuge tubes and stored at −20 °C until processing. To extract viral nucleic acids, the microcentrifuge tubes were centrifuged at 15,700× *g* for 15 min at 7 °C. Most of the supernatant was discarded, and the remaining 200 μL was used to resuspend the pellet.

#### *2.3. DNA Isolation (QIAcube System vs. MagNA Pure 96 System)*

From January 2012 to December 2013, DNA extraction was performed with the automated QIAcube system (Qiagen, Hilden, Germany), following the manufacturer's protocols.

From January 2014 to December 2014, viral nucleic acids were extracted from the resuspended pellet using the automated MagNA Pure 96 system (Roche Diagnostics GmbH, Mannheim, Germany) according to the manufacturer's instructions.

#### *2.4. DNA Amplification (Nested-PCR vs. Multiplex Real-Time PCR)*

From January 2012 to June 2013, the extracted DNA samples were subjected to nested polymerase chain reaction (PCR) amplification using the Ampliquality HPV-HS Bio Kit (AB Analitica, Padova, Italy), following the manufacturer's instructions.

The method involves a first amplification of the L1 region of the viral genome, followed by a nested PCR with biotinylated primers. PCR products were separated by 3% agarose gel electrophoresis and stained with ethidium bromide to visualize the DNA under ultraviolet light. Subsequently, PCR products were typed by reverse line blot hybridization using the Ampliquality HPV-Type Kit (AB Analitica, Padova, Italy).

To assess the suitability of the extracted DNA, a 202 bp region of the thiosulfate sulfurtransferase (TST) gene was amplified during the first amplification step.

From July 2013 to December 2014, the extracted DNA samples were subjected to multiplex real-time PCR (mRT-PCR) with the Anyplex™ II HPV 28 Detection System (Seegene, Seoul, Korea), which targets the viral L1 region and provides simultaneous detection and genotyping of 28 HPV types. Briefly, the detection consists of two PCR reactions (panels A and B). Panel A includes 14 high-risk (HR) HPV types, while panel B includes 5 HR and 9 low-risk (LR) types. PCR was performed on the CFX96 Real-Time PCR system (Bio-Rad, Hercules, CA, USA).
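The extraction and amplification periods described in Sections 2.3 and 2.4 overlap to give three method combinations. The following minimal Python sketch maps a sampling date to its combination. The function name `methods_for` is hypothetical, and the exact day boundaries (first day of each month) are an assumption, since the text gives only months.

```python
from datetime import date

# Assumed day-level boundaries; the text states only months.
STUDY_START = date(2012, 1, 1)
STUDY_END = date(2014, 12, 31)
EXTRACTION_SWITCH = date(2014, 1, 1)     # QIAcube -> MagNA Pure 96
AMPLIFICATION_SWITCH = date(2013, 7, 1)  # nested PCR -> multiplex real-time PCR


def methods_for(sampling_date):
    """Return the (extraction, amplification) pair in use on a sampling date."""
    if not (STUDY_START <= sampling_date <= STUDY_END):
        raise ValueError("sampling date outside the study period")
    extraction = (
        "QIAcube" if sampling_date < EXTRACTION_SWITCH else "MagNA Pure 96"
    )
    amplification = (
        "nested PCR"
        if sampling_date < AMPLIFICATION_SWITCH
        else "multiplex real-time PCR"
    )
    return extraction, amplification


if __name__ == "__main__":
    for d in (date(2012, 3, 1), date(2013, 9, 1), date(2014, 6, 1)):
        print(d.isoformat(), methods_for(d))
```

Under these assumptions, samples from July to December 2013 form the only group that combines QIAcube extraction with real-time amplification.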

#### *2.5. Statistical Analysis*

Differences in HPV prevalence were evaluated by the chi-squared test or Fisher's exact test, as appropriate. *p*-values were corrected by the Benjamini and Hochberg (BH) procedure with a False Discovery Rate (FDR) < 1% [10]. Where the comparison among the combination groups of extraction and amplification methods was statistically significant, pairwise comparisons were performed by Fisher's exact test with BH correction at FDR < 1%.
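The BH correction can be illustrated with a short standard-library Python sketch. It is not the code used in the study, which was run in R. The function names and the example *p*-values are hypothetical. The strict `<` comparison follows the "FDR < 1%" wording of the text.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_at_fdr(p_values, fdr=0.01):
    """Flag the tests whose BH-adjusted p-value is below the FDR threshold."""
    return [q < fdr for q in benjamini_hochberg(p_values)]


if __name__ == "__main__":
    raw = [0.001, 0.008, 0.039, 0.041, 0.042]  # made-up example values
    print(benjamini_hochberg(raw))
    print(significant_at_fdr(raw))
```

Each adjusted value is the smallest FDR level at which the corresponding test would be declared significant, so the 1% threshold can be applied to the adjusted values directly.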

To assess the association between the real-time assay and the prevalence of overall HPV infection and of each HPV genotype (dependent variables) in the analyzed samples (samples dataset), logistic regression analysis was performed. Because the birth date was missing for 102 patients, missing ages were imputed by multiple imputation with fully conditional specification, as implemented in the Multivariate Imputation by Chained Equations (MICE) package for the R environment. A predictive mean matching imputation model was specified under the assumption that ages were missing at random, and the number of iterations was set to 20. Fifty imputed datasets were generated. For each dataset, a logistic regression model was fitted, and the 50 models were pooled with the pool function of the mice package. In total, 27 logistic regression models were evaluated. All *p*-values obtained from the logistic regression models were corrected for multiple comparisons by the BH procedure with FDR < 1%.
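Pooling estimates across imputed datasets follows Rubin's rules, which is what the pool function of mice applies. The sketch below is a simplified Python illustration for a single coefficient. It is not the code used in the study, and it omits the degrees-of-freedom adjustment that mice also computes. The function name and the input values are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates  : per-imputation log-odds estimates
    std_errors : per-imputation standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching inputs from at least two imputations")
    pooled = mean(estimates)                     # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # within-imputation variance
    between = variance(estimates)                # between-imputation variance
    total = within + (1 + 1 / m) * between       # total variance
    return pooled, math.sqrt(total)


if __name__ == "__main__":
    # Made-up log-odds estimates from three imputed datasets.
    beta, se = pool_rubin([0.5, 0.7, 0.6], [0.2, 0.2, 0.2])
    print("pooled log-odds:", beta, "SE:", se)
    print("pooled odds ratio:", math.exp(beta))
```

The pooled odds ratio is the exponential of the pooled log-odds coefficient. The between-imputation term widens the standard error to reflect uncertainty about the imputed ages.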

Logistic regression analysis assumes that the observations are independent, an assumption that repeated samples from the same patient may violate. To verify this assumption, a reduced dataset containing only the first sample for each patient (patient dataset) was generated and all analyses were repeated on it. The odds ratio estimates of the logistic regression models based on the samples dataset and on the patient dataset were then compared.

All statistical analyses were performed in the open-source environment R [11].
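Building the reduced patient dataset can be sketched in Python as follows. This is an illustration only, not the code used in the study. The record layout and the field names `patient_id` and `date` are assumptions.

```python
from datetime import date


def first_sample_per_patient(samples):
    """Keep only the earliest sample for each patient.

    samples : iterable of dicts with the (assumed) keys
              "patient_id" and "date".
    """
    first = {}
    for sample in samples:
        pid = sample["patient_id"]
        # Replace the stored record only if this sample is strictly earlier.
        if pid not in first or sample["date"] < first[pid]["date"]:
            first[pid] = sample
    return list(first.values())


if __name__ == "__main__":
    # Made-up records with anonymized alphanumeric patient codes.
    records = [
        {"patient_id": "A1", "date": date(2013, 5, 2), "hpv_positive": True},
        {"patient_id": "A1", "date": date(2012, 3, 9), "hpv_positive": False},
        {"patient_id": "B7", "date": date(2014, 1, 20), "hpv_positive": True},
    ]
    for rec in first_sample_per_patient(records):
        print(rec["patient_id"], rec["date"].isoformat())
```

Each patient then contributes exactly one observation, so the models fitted on this dataset are not affected by within-patient correlation between repeated samples.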
