**4. Discussion**

Previously, we reported that 93.3% of HPV-positive anal cancer cases diagnosed in Scotland between 2009 and 2018 were caused by HPV 16. In this study, we found that 76% of cases belonged to the A1 sub-lineage, followed by A2 (16%).

In the control group of asymptomatic men, a similar prevalence of A1 and A2 was observed. The differences identified were the presence of A4 in the anal cancers (4.7%), which was absent in the control group; the presence of the C lineage, which was detected only in the control group; and the presence of the sub-lineage D1 in the control group (3%), which had a lower prevalence of 0.81% in the cancers. The high prevalence of sub-lineages A1 and A2 is consistent with previously published studies in European cohorts. Gonçalves et al. (2022) found a higher prevalence of the A lineage, mainly A1, in the anal canal of asymptomatic men [29], and Nicolás-Párraga et al. (2016) found that A1–3 sub-lineages were identified in 96.1% of the European cases [30]. Beyond Europe, Volpini et al. (2017) investigated the HPV 16 variants in cervical and anal samples collated in Brazil and determined that proportionally fewer of the anal cancer samples (70.8%) were classified as A1–3 sub-lineages [12].

The data collated in this study add to the limited information on the pattern and implications of HPV sub-lineages in the anus. Though we did not see significant associations with demographic variables or underlying disease status, these observations need to be confirmed or refuted by future studies with larger sample sizes.

To our knowledge, no other studies have investigated the association between HPV 16 sub-lineages in anal cancer and overall survival. We did not observe that A1 vs. non-A1 sub-lineages influenced overall survival in either the univariate or the adjusted analysis. Interestingly, Lang Kuhs et al. (2022) recently examined the genetic variation of HPV 16 and its association with clinical outcomes in HPV 16-positive oropharyngeal cancer patients [31]. They investigated different high-risk single nucleotide polymorphisms (SNPs) and found that patients with one or more high-risk SNPs had significantly shorter median survival times. Most of these SNPs were common to the D2 sub-lineage, which has also been associated with a higher risk of cancer in the cervix [14]. Due to the absence of D2 cases in the present study, we were not able to explore this in the present work; however, the identification of these high-risk SNPs may be very helpful for patient and treatment management.

Although we identified potential integration of the HPV 16 genome (inferred from loss of sequence), the number of such cases was small, so we did not perform any further analysis, including in relation to implications for survival. Given the relative lack of information on the extent and implications of integration in anal cancer, we would assert that this is an area that would benefit from further study.

We acknowledge that this study has limitations. The asymptomatic population consisted entirely of men, whereas the majority of the cancer samples came from women (75.63%) rather than men (24.37%); this was due to pragmatic reasons relating to available material. However, the data did not show differences in the distribution of HPV 16 sub-lineages between women and men in the anal cancer group. Additionally, as discussed earlier, we believe the observations made in the present work would benefit from validation in a larger sample of cases and controls, and we hope this study serves as a primer for such work. Though the number of cancer cases was not trivial (*n* = 253), particularly given that the Scottish European age-standardized rate (EASR) (per 100,000 person-years at risk) was 2.6 in 2017, we appreciate that detecting rarer sub-lineages with precision can require large sample sizes.
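To illustrate why rarer sub-lineages are hard to estimate precisely, the following minimal sketch computes a 95% Wilson score interval for a prevalence estimate. It is not part of the study's analysis: the function name `wilson_interval` and the example counts (2 of 253, and the same proportion in a cohort ten times larger) are hypothetical and chosen only to show how the interval narrows as the sample grows.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of samples carrying the sub-lineage (hypothetical here)
    n:         total number of samples typed
    z:         normal quantile (1.96 gives an approximate 95% interval)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to guard against tiny floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts, for illustration only (not taken from the study data).
for successes, n in [(2, 253), (20, 2530)]:
    low, high = wilson_interval(successes, n)
    print(f"{successes}/{n}: {100 * successes / n:.2f}% "
          f"(95% CI {100 * low:.2f}% to {100 * high:.2f}%)")
```

With the same observed proportion, the interval for the larger hypothetical cohort is markedly narrower, which is the sense in which rare sub-lineages call for larger sample sizes.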

In the UK, there is no screening program for anal cancer. However, since 2017, there has been an opportunistic vaccination program for MSM, and in 2019, the national HPV vaccination program became gender neutral. In terms of vaccines, Godi et al. (2019) reported that HPV 16 lineage variants B, C and D exhibited slightly (<two-fold) reduced sensitivity to nonavalent vaccine sera compared to lineage A [32]. Therefore, the high prevalence of lineage A in the samples included in this study could be interpreted as positive for vaccine efficacy, particularly given that gender-neutral vaccination is now a part of core policy in the UK and several other countries.

This study has demonstrated the technical feasibility of detecting HPV 16 sub-lineages in anal cancer samples and residual material from rectal swabs. Though some differences in the presence of non-A sub-lineages were detectable between the cancer and asymptomatic populations, the consistency, magnitude and implications of these would benefit from further study. The predominance of lineage A is consistent with existing European data and suggests that sub-lineage identification in itself may not be informative for prognostication.

**Author Contributions:** D.G. was involved in the planning of experiments, delivered the end-to-end whole genome sequencing process and performed data analysis, including the analysis of next-generation sequencing data. D.G. also drafted the manuscript. L.S.A.M. assisted with the planning of laboratory experiments, supported quality checking and analysis of data, including sequencing data, and performed critical appraisal of the manuscript. R.G. performed original data retrieval on the clinical cohort, including the collation of clinical-demographic variables, and supported critical appraisal of the manuscript. M.T.G.H. was the lead academic supervisor for the project and was involved in advising on experimental methodology and technology, providing support for data analysis and performing critical appraisal of the manuscript. K.C. was the principal clinical investigator for the project and supported interaction with the bio-resource and pathology team for sample collation, advised on experimental and analytical methodology, provided support for data analysis and assisted in the drafting and critical appraisal of the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Use of samples for the present project was approved by the Southeast of Scotland National Research for Scotland Bioresource (NRS) (application reference SR 1283 (24 September 2019) and SR1364 (22 January 2020)). Favorable ethical opinion to conduct the research was provided by University of St Andrews Teaching and Research Ethics Committee, reference MD 14482 (5 July 2019).

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data in anonymized form can be made available upon reasonable request to the senior author, following due process of governance and the Scottish Data Protection Regulations. GenBank submission IDs 2637056 and 2638666.

**Conflicts of Interest:** D.G.: Received gratis consumables from Seegene to support the HPV genotyping of the anal cancer samples. K.C.: K.C.'s institution has received research funding or gratis consumables to support research from the following commercial entities in the last 3 years: Cepheid, Euroimmun, GeneFirst, SelfScreen, Hiantis, Seegene, Roche, Abbott and Hologic. All other authors have nothing to declare.
