**2. Results**

#### *2.1. Single-Genome Sequencing of Full-Length HIV-1 pol Gene in Longitudinal Samples from a Patient Failing RAL-Containing Therapy*

Single genomes were generated from five out of six samples over a 16-month period (Table 1). A total of 117 single genomes were generated at the following time points: before initiation of RAL therapy (preRAL; n = 16), 4 and 5 months after initiation of RAL therapy (4RAL and 5RAL; n = 26 and 23, respectively), 4 months after cessation of RAL therapy (4post; n = 39) and 2 weeks after re-initiation of RAL therapy (reRAL; n = 13).


**Table 1.** Clinical history of patient.

a na = no amplification, possibly due to low viral load.

Analysis of the 16 single genomes before exposure to RAL (preRAL) revealed no RAL resistance mutations, except for an amino acid substitution (G163E) in a single genome at a position associated with RAL resistance (Figure 1). In contrast, all single genomes generated after initiation of RAL treatment (4RAL and 5RAL; n = 49) contained RAL resistance mutations. At 4RAL, mutations were present at all three major RAL resistance positions: Y143R/C (n = 15), Q148R (n = 10) and N155H (n = 1). At 5RAL, the major RAL resistance pathways were reduced to two: Y143R/C (n = 16) and Q148R (n = 7). None of the three major RAL resistance-associated mutations at positions 143, 148 and 155 observed at 4RAL were found on the same genome. However, all three major RAL resistance mutations were linked to one of the following accessory mutations: E92Q, T97A, G140A, V151I and G163R/K. Analysis of the genetic linkage of the major and accessory resistance mutations revealed seven different RAL-associated haplotypes in the 49 single genomes during the first round of RAL treatment (4RAL and 5RAL): Y143R + G163R (n = 27), Y143R + G163K (n = 1), Y143C + E92Q (n = 1), Y143C + G163R (n = 1), Y143C + T97A (n = 1), Q148R + G140A (n = 17) and N155H + V151I (n = 1). Six of the haplotypes were present at 4RAL, and this number decreased to three a month later at 5RAL. However, at both time points, two haplotypes, Y143R + G163R and Q148R + G140A, dominated the population, constituting 46.2% and 38.5% of the population at 4RAL and 65.2% and 30.4% at 5RAL, respectively.
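The haplotype tallies can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the counts and percentages reported in this section. The per-time-point counts for Y143R + G163R are not stated in the text; they are back-calculated here from the reported percentages and should be read as an assumption. The variable and function names are illustrative.

```python
# Haplotype counts pooled over 4RAL and 5RAL, as reported in the text.
haplotypes = {
    "Y143R + G163R": 27,
    "Y143R + G163K": 1,
    "Y143C + E92Q": 1,
    "Y143C + G163R": 1,
    "Y143C + T97A": 1,
    "Q148R + G140A": 17,
    "N155H + V151I": 1,
}

# Number of single genomes per time point, as reported in the text.
genomes = {"4RAL": 26, "5RAL": 23}

# Q148R counts per time point are stated in the text.
q148r = {"4RAL": 10, "5RAL": 7}

# Assumption: Y143R + G163R counts per time point are back-calculated
# from the reported percentages (46.2% and 65.2%).
reported_pct = {"4RAL": 46.2, "5RAL": 65.2}
y143r_g163r = {
    tp: round(reported_pct[tp] / 100 * genomes[tp]) for tp in genomes
}


def percent(count, n):
    """Percentage of n genomes carrying a haplotype, to one decimal place."""
    return round(100 * count / n, 1)


total = sum(haplotypes.values())
print("genomes with a RAL-resistant haplotype:", total)
for tp in genomes:
    print(
        tp,
        "Y143R + G163R:", percent(y143r_g163r[tp], genomes[tp]),
        "Q148R + G140A:", percent(q148r[tp], genomes[tp]),
    )
```

Under these assumptions the back-calculated counts add up to the pooled totals, which is consistent with every Q148R genome also carrying G140A.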

**Figure 1.** Genetic linkage of drug resistance mutations in PR, RT and IN genes from patient failing RAL-containing therapy. (**A**) Genetically linked major PI, NRTI, NNRTI and RAL resistance mutations identified by single genome sequencing in longitudinal samples are shown in columns. The percent of single genomes containing the linked resistance mutations at each time point are shown in bar graphs above the mutation columns. The details of treatment are indicated above the bar graphs and the number of single genomes generated at each time point is indicated below each figure. The bar graph on the right-hand side shows the frequency of the mutations in the Stanford HIV Drug Resistance Database in treatment-experienced patients infected with subtype B. (**B**) Highlighter plot of amino acid differences throughout the pol gene of the patient single genomes during the different time points and treatment regimens. The differences are in reference to the ancestral single genome at the preRAL time point. preRAL = before initiation of RAL therapy; 4RAL, 5RAL and 8RAL = 4, 5 and 8 months after initiation of RAL therapy, respectively; 1post, 3post and 4post = 1, 3 and 4 months after stopping RAL therapy; reRAL = 0.5 months after reinitiating RAL therapy; \* = accessory RAL resistance mutations included; \*\* = L74V only; \*\*\* = T215Y only; \*\*\*\* = Y143R and Y143C only.

Within four months following the withdrawal of RAL treatment (4post), 97.4% (38/39) of the single genomes contained no RAL resistance-associated mutations, with only one single genome still harbouring the G163R mutation and a novel substitution at major resistance position 143 (Y to G). This minor variant was not detected in any of the single genomes at 4RAL or 5RAL. Two weeks after RAL therapy was re-initiated (reRAL), this novel Y143G + G163R mutant dominated the viral population, with all of the single genomes (n = 13) containing the double mutation.

#### *2.2. The Development and Linkage of Drug Resistance Mutations in Full-Length HIV-1 pol Gene*

We investigated the linkage of RAL resistance mutations in IN to drug resistance mutations in PR and RT. We found little variation in the composition of PI and RTI resistance mutations over the sampling period. All 117 amplified single genomes contained numerous PI (V32I, I47V, I54L, I84V, and L90M), NRTI (M41L/I, D67N, L74V/I, M184V, T215Y/C and K219E) and NNRTI (L100I, K103N and N348I) resistance mutations (Figure 1). This is consistent with the extensive ART experience of the patient.

These PI and RTI resistance mutations were maintained in the viral population, even in the absence of the respective drugs. For example, the PI resistance mutations in the preRAL time point were still present at the 4post time point when the patient was no longer on PIs (Figure 1). We observed the genetic linkage of drug resistance mutations across the full-length *pol* gene in 100% (n = 64) of single genomes that contained RAL resistance mutations (Figure 1A). Analysis of the Stanford HIV Drug Resistance Database showed that these mutations were identified in a significant proportion of patient samples infected with subtype B, except for the PI V82L, NRTI L74I and T215C, and InSTI Y143G (Figure 1A). However, the frequency of linkage of the mutations could not be verified, as the data were from population-based Sanger sequencing, and not single-genome sequencing.

Although the major PI and RTI resistance mutations were maintained throughout the study period, regardless of treatment regimen, there was selection for or against the major RAL resistance mutations and other mutations in all three gene regions, depending on the treatment regimen (Figure 1B).

#### *2.3. Intrapatient Evolution of InSTI Susceptibility*

We investigated the effect of the different drug resistance-associated mutations identified during the development of RAL resistance on RAL susceptibility. We generated nine recombinant virus vectors expressing patient-derived *IN* genes, comprising eight different RAL-resistant haplotypes and a wild-type sequence, all identified by single genome sequencing (Figure 2). As expected, the RAL EC50 of the virus expressing the patient-derived wild-type *IN* from the pre-RAL time point (ptA\_WT*IN*) was equivalent to that of the wild-type control virus (p8.9NSX) at 4.2 ± 0.11 vs. 4.5 ± 0.37 nM (*p* = 0.76; Figure 3A). In contrast, all viruses containing patient-derived *IN* from single genomes sampled during RAL treatment exhibited significant decreases in RAL susceptibility of up to 200-fold compared to that of the wild-type *IN* control, with EC50 values ranging from 84.6 ± 10.5 to 900 ± 145.2 nM (*p* ≤ 0.0001). Interestingly, the major RAL-resistant haplotype at 4RAL and 5RAL, Y143R + G163R, had the highest decrease in RAL susceptibility (200-fold), compared to a moderate decrease of 50-fold for the Y143G + G163R haplotype, which emerged during re-initiation of RAL therapy.
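The fold changes quoted above follow from dividing each EC50 by the EC50 of the wild-type control. The sketch below assumes the fold change is a simple ratio of mean EC50 values relative to p8.9NSX; the function and constant names are illustrative and are not taken from the study.

```python
def fold_change(ec50_test_nm, ec50_control_nm):
    """Ratio of a test virus EC50 to the control EC50 (both in nM)."""
    if ec50_control_nm <= 0:
        raise ValueError("control EC50 must be positive")
    return ec50_test_nm / ec50_control_nm


# Mean RAL EC50 values (nM) as reported in the text.
WT_CONTROL_RAL_EC50 = 4.5        # p8.9NSX wild-type control
PTA_WT_IN_RAL_EC50 = 4.2         # patient-derived wild-type IN
RESISTANT_RAL_EC50_RANGE = (84.6, 900.0)

low, high = (
    fold_change(ec50, WT_CONTROL_RAL_EC50)
    for ec50 in RESISTANT_RAL_EC50_RANGE
)
baseline = fold_change(PTA_WT_IN_RAL_EC50, WT_CONTROL_RAL_EC50)

print(f"patient-derived wild-type IN: {baseline:.2f}-fold")
print(f"RAL-resistant haplotypes: {low:.1f}-fold to {high:.0f}-fold")
```

With these inputs the upper end of the range reproduces the 200-fold decrease reported for Y143R + G163R, and the patient-derived wild-type *IN* stays close to 1-fold.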

**Figure 2.** A schematic representation of recombinant viral vectors used in phenotypic drug susceptibility and viral fitness assays. Patient-derived *IN* genes containing different RAL-resistant haplotypes or wild-type only were subcloned into the p8.9NSX vector. Patient-derived full-length *pol* vectors were generated by subcloning the patient-derived *IN* genes into a vector expressing the PR + RT fragment from the patient. The patient-derived *PR* + *RT* fragment contained the PI resistance mutations L10F, V32I, I47V, I54L, A71T, I84V and L90M, and RTI resistance mutations M41L, D67N, L74V, M184V, T215Y, K219E, L100I, K103N and N348I. Wild-type control = p8.9NSX wild-type subtype B; ptA\_WT*IN* = patient-derived wild-type in *IN*; ptA\_*PR* + *RT* = patient-derived *PR* and *RT*; ptA\_*pol* = patient-derived full-length *pol*, wild-type in *IN*. Vectors with patient-derived full-length *pol* containing RAL-resistant haplotypes are not shown.

**Figure 3.** Susceptibility to RAL exhibited by recombinant viruses expressing patient-derived HIV-1 gene fragments. (**A**) Susceptibility to RAL exhibited by recombinant viruses expressing patient-derived *IN* genes only. (**B**) Comparison of RAL susceptibilities exhibited by recombinant viruses expressing patient-derived *IN* genes only or full-length *pol* genes. Error bars represent standard error of the mean of 6 to 12 independent experiments. Fold changes in EC50 values compared to the p8.9NSX wild-type control are indicated next to each bar. Viruses exhibiting a significantly higher RAL EC50 (*p* < 0.05) compared to wild-type control (**A**) or their full-length *pol* counterpart (**B**) are indicated with \*.

Next, we investigated the susceptibility of the viruses expressing patient-derived fragments to a different InSTI, EVG. Interestingly, the EVG EC50 of the virus expressing the patient-derived wild-type *IN* from the pre-RAL time point (ptA\_WT*IN*) was significantly higher than that of the wild-type control (p8.9NSX) at 0.24 ± 0.026 vs. 0.1 ± 0.016 nM (*p* = 0.0002; Figure 4A). This is consistent with other studies that have shown a 2- to 3-fold increase in EVG EC50 for viruses expressing *IN* genes from InSTI-naïve patients compared to wild-type controls [17,18]. Two viruses expressing patient-derived *IN* containing the RAL resistance mutations, Y143C + G163R and Y143G + G163R, exhibited EVG EC50 values similar to that of the wild-type control at 0.06 ± 0.022 and 0.22 ± 0.088 nM, respectively (*p* ≥ 0.12). The remaining six viruses expressing patient-derived *IN* only had significantly higher EVG EC50 values than the wild-type control, ranging from 0.48 ± 0.064 to 23.31 ± 0.69 nM (*p* < 0.05). These include the virus expressing the Y143R + G163R mutation combination, which exhibited the highest fold change in RAL EC50 (200-fold) but only a 5-fold change in EVG susceptibility. The highest fold change in EVG susceptibility (227-fold) was exhibited by the virus expressing the Q148R + G140A mutation combination, which also resulted in a very high fold change in RAL susceptibility (147-fold).
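To make the contrast between the two drugs concrete, the reported fold changes can be placed side by side. The snippet below only restates values quoted in this section; the dictionary layout and the `more_affected` helper are hypothetical names introduced for illustration and are not part of the study's analysis.

```python
# Fold changes in EC50 relative to the p8.9NSX control, as quoted in the text.
reported_fold_change = {
    "Y143R + G163R": {"RAL": 200, "EVG": 5},
    "Q148R + G140A": {"RAL": 147, "EVG": 227},
}


def more_affected(folds):
    """Return the drug with the larger reported fold change (hypothetical helper)."""
    return max(folds, key=folds.get)


for haplotype, folds in reported_fold_change.items():
    ratio = folds["RAL"] / folds["EVG"]
    print(
        f"{haplotype}: RAL/EVG fold-change ratio = {ratio:.2f}, "
        f"larger shift for {more_affected(folds)}"
    )

# Baseline EVG shift of the patient-derived wild-type IN (mean EC50 values in nM).
baseline_shift = round(0.24 / 0.1, 1)
print(f"ptA_WTIN EVG EC50 is {baseline_shift}-fold that of the control")
```

The baseline shift computed this way falls inside the 2- to 3-fold range cited for InSTI-naïve patients, and the two dominant haplotypes show opposite cross-resistance profiles.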

**Figure 4.** Cross-resistance to EVG exhibited by recombinant viruses expressing patient-derived HIV-1 gene fragments. (**A**) Susceptibility to EVG exhibited by recombinant viruses expressing patient-derived *IN* genes only. (**B**) Comparison of EVG susceptibilities exhibited by recombinant viruses expressing patient-derived *IN* genes only or full-length *pol* genes. Error bars represent standard error of the mean of 6 to 12 independent experiments. Fold changes in EC50 values compared to the p8.9NSX wild-type control are indicated next to each bar. Viruses exhibiting a significantly higher EVG EC50 (*p* < 0.05) compared to wild-type control (**A**) or their full-length *pol* counterpart (**B**) are indicated with \*.

#### *2.4. The Effects of Coevolved PR and RT Genes on Susceptibility to InSTIs*

To determine if the coevolved *PR* and *RT* genes influence the susceptibility to the InSTIs RAL and EVG, we generated recombinant vectors expressing the patient-derived full-length *pol* gene. The virus expressing the patient-derived full-length *pol* with resistance mutations in *PR* and *RT* but wild-type in *IN* (designated ptA\_*pol*), as well as the virus expressing the patient-derived *PR* and *RT* only (designated ptA\_*PR* + *RT*), had RAL susceptibilities comparable to that of the wild-type control virus (p8.9NSX), with EC50 values of 5.5 ± 1.1, 4.1 ± 0.48 and 4.5 ± 0.37 nM, respectively (*p* ≥ 0.30; Figure 3B). Overall, the viruses expressing patient-derived full-length *pol* from time points during RAL treatment exhibited RAL susceptibilities that were similar to those of viruses expressing the respective patient-derived *IN* only, except for the Y143R + G163R, Y143R + G163K and Y143C + T97A mutation combinations. For these three viruses, the patient-derived *IN* only showed significantly greater decreases in RAL susceptibility than the respective full-length *pol* (*p* ≤ 0.032). This suggests that the coevolved *PR* and *RT* can have a negative effect on resistance to RAL that depends on the combination of resistance mutations in *IN*.

To determine if the coevolved *PR* and *RT* also influence EVG susceptibility, we investigated the EVG susceptibilities of viruses expressing patient-derived full-length *pol* (Figure 4B). Similar to the virus expressing patient-derived wild-type *IN* gene only (ptA\_WT*IN*), the virus expressing patient-derived full-length *pol* with resistant *PR* and *RT* genes but wild-type *IN* gene (ptA\_*pol*) also exhibited a significantly higher EVG EC50 than the wild-type control virus at 0.18 ± 0.033 vs. 0.1 ± 0.016 nM (*p* = 0.03). Overall, the EVG EC50 values of viruses expressing patient-derived full-length *pol* were similar to those of viruses expressing patient-derived *IN* gene only, with the exception of the virus expressing Y143C + E92Q mutations, for which the patient-derived *IN* gene only showed a significantly greater decrease in EVG susceptibility than the full-length *pol* gene (*p* = 0.0064). Again, this may indicate that the influence of the patient's coevolved *PR* and *RT* genes on susceptibility to EVG depends on the combination of InSTI resistance mutations; however, this effect differs between EVG and RAL.

#### *2.5. The Effects of Patient-Derived pol Gene Fragments on Viral Replicative Fitness*

Next, we tested the replicative fitness of the viruses expressing patient-derived full-length *pol* or *IN* gene only (Figure 5). Using a single-replication-cycle assay, the wild-type control virus, the virus expressing patient-derived *PR* + *RT* (ptA\_*PR* + *RT*) and the virus expressing patient-derived full-length *pol* with wild-type *IN* showed similar replicative fitness to the virus expressing patient-derived wild-type *IN* only (ptA\_WT*IN*; set to 100%) at 110.5 ± 7.7%, 109.7 ± 12.3% and 102.9 ± 8%, respectively (*p* ≥ 0.39).

Interestingly, the only other viruses that showed replicative fitness comparable to that of ptA\_WT*IN* were those expressing the rare *IN* Y143G + G163R mutation combination, either as patient-derived full-length *pol* or as *IN* only, at 86.9 ± 7.6% and 102 ± 6.6%, respectively (*p* = 0.28 and 0.85). All other viruses expressing patient-derived full-length *pol* or *IN* only had significantly lower replicative fitness than ptA\_WT*IN*, ranging from 12.9 ± 0.6% to 72.9 ± 3.4% (*p* ≤ 0.01).

In general, the replicative fitness of viruses expressing full-length *pol* was greater than or comparable to that of viruses expressing the respective *IN* gene only. The viruses showing significantly increased replicative fitness upon expression of patient-derived full-length *pol* relative to *IN* gene only were those with the following RAL resistance mutation combinations: Q148R + G140A (*p* = 0.0023), Y143R + G163R (*p* = 0.024), Y143R + G163K (*p* ≤ 0.0001) and Y143C + G163R (*p* = 0.013). Furthermore, viruses that dominated the population during the two RAL treatment phases had both high levels of RAL resistance and high replicative fitness (Figure 6).

**Figure 5.** The effect on viral replicative fitness of patient-derived *IN* gene only or full-length *pol* gene. Viral infectivity in a single-replication-cycle assay was used to determine the replicative fitness of recombinant viruses expressing patient-derived *IN* gene only or full-length *pol* genes. The replicative fitness relative to ptA\_WT*IN* control (vector containing patient-derived *IN* only from the pre-RAL time point), set at 100%, is shown for each virus. Significant differences (*p* < 0.05) between patient-derived *IN* only viruses and their full-length *pol* counterparts are indicated with \*. The error bars represent standard error of the mean of six independent experiments.

**Figure 6.** Relationship between the replicative fitness and RAL susceptibility exhibited by recombinant viruses from ptA. A graph plotting the relationship between replicative fitness and RAL susceptibility of recombinant viruses expressing either patient-derived *IN* gene only or full-length *pol* gene. The graph is equally divided into four hypothetical quadrants: (**A**) high replicative fitness (>50%) and low RAL resistance (<450 nM), (**B**) high replicative fitness (>50%) and high RAL resistance (>450 nM), (**C**) low replicative fitness (<50%) and high RAL resistance (>450 nM), (**D**) low replicative fitness (<50%) and low RAL resistance (<450 nM). Triangles represent viruses expressing patient-derived *IN* gene only; circles represent the respective viruses expressing full-length *pol* gene. The shaded oval represents viruses that dominated the viral population during RAL-containing salvage therapy at 4RAL, 5RAL and reRAL.
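The quadrant scheme in Figure 6 amounts to two threshold tests, one on replicative fitness (50%) and one on RAL EC50 (450 nM). The sketch below reads the quadrants by those cutoffs: A is above 50% and below 450 nM, B is above both, C is below 50% and above 450 nM, and D is below both. Treating values exactly on a cutoff as "low" is an assumption, since the caption does not say how ties are handled. The function name and the example inputs are hypothetical and do not correspond to specific viruses in the figure.

```python
FITNESS_CUTOFF_PCT = 50.0   # replicative fitness relative to ptA_WTIN
EC50_CUTOFF_NM = 450.0      # RAL EC50


def quadrant(fitness_pct, ral_ec50_nm):
    """Assign a virus to quadrant A, B, C or D of the fitness/resistance plot."""
    high_fitness = fitness_pct > FITNESS_CUTOFF_PCT
    high_resistance = ral_ec50_nm > EC50_CUTOFF_NM
    if high_fitness and not high_resistance:
        return "A"
    if high_fitness and high_resistance:
        return "B"
    if high_resistance:
        return "C"
    return "D"


# Hypothetical (fitness %, RAL EC50 nM) pairs chosen only to hit each quadrant.
examples = [(100.0, 5.0), (80.0, 900.0), (20.0, 900.0), (20.0, 100.0)]
for fitness, ec50 in examples:
    print(fitness, ec50, quadrant(fitness, ec50))
```

Quadrant B is the region of interest here, since it combines high replicative fitness with high RAL resistance.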
