Next Article in Journal
Leveraging ChatGPT and Long Short-Term Memory in Recommender Algorithm for Self-Management of Cardiovascular Risk Factors
Previous Article in Journal
Forecasting and Multilevel Early Warning of Wind Speed Using an Adaptive Kernel Estimator and Optimized Gated Recurrent Units
Previous Article in Special Issue
Influence of the Effective Reproduction Number on the SIR Model with a Dynamic Transmission Rate
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Validation of a Multi-Strain HIV Within-Host Model with AIDS Clinical Studies

by
Necibe Tuncer
1,*,
Kia Ghods
2,†,
Vivek Sreejithkumar
1,†,
Adin Garbowit
1,†,
Mark Zagha
1,† and
Maia Martcheva
3
1
Department of Mathematics and Statistics, Florida Atlantic University, Boca Raton, FL 33431, USA
2
Department of Mathematics, Princeton University, Princeton, NJ 08544, USA
3
Department of Mathematics, University of Florida, P.O. Box 118105, Gainesville, FL 32611, USA
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2024, 12(16), 2583; https://doi.org/10.3390/math12162583
Submission received: 28 June 2024 / Revised: 12 August 2024 / Accepted: 16 August 2024 / Published: 21 August 2024
(This article belongs to the Special Issue Advances in Mathematical Biology and Applications)

Abstract

:
We used a previously introduced HIV within-host model with sensitive and resistant strains and validated it with two data sets. The first data set is from a clinical study that investigated multi-drug treatments and measured the total CD4+ cell count and viral load. All nine patients in this data set experienced virologic failure. The second data set includes a unique patient who was treated with a unique drug and for whom both the sensitive and resistant strains were measured as well as the CD4+ cells. We studied the structural identifiability of the model with respect to each data set. With respect to the first data set, the model was structurally identifiable when the viral production rate of the sensitive strain was fixed and distinct from the viral production rate of the resistant strain. With respect to the second data set, the model was always structurally identifiable. We fit the model to the first data set using nonlinear mixed effect modeling in Monolix and estimated the population-level parameters. We inferred that the average time to emergence of a resistant strain is 844 days after treatment starts. We fit the model to the second data set and found out that the all the parameters except the mutation rate were practically identifiable.

1. Introduction

HIV was first detected in 1981 when several people were found to have opportunistic infections and severe immunodeficiency [1]. In the initial years, HIV-infected individuals were treated with a single drug, which gave rise to a resistant strain fairly quickly [2,3]. In 1995, the highly active antiretroviral therapy (HAART) was introduced, and the patients started being treated with several drugs of different classes [1]. The new treatment was expected to reduce the HIV resistance and therapy failure [4]. However, even when patients were treated with 2–5 medications from 2–3 different classes [5], rebound and treatment failures still occurred [6]. HIV resistance and viral rebound continue to present a puzzle that needs to be better understood [7].
Within-host modeling of HIV and HIV drug resistance has a long history [3,8,9,10]. Multi-strain HIV models for resistance evolution have been developed in many studies [11,12]. A model of this type was fully analyzed in [13], and a model with immune response was analyzed in [14]. While the analysis and simulations of multi-strain within-host models of HIV are well developed, including in the context of multi-scale models [15], very little has been done in connecting these models to within-host HIV data. Data and single-strain within-host models have played a significant role in understanding HIV [16], and yet multi-strain models have not been fit to data, and implications have not been studied based on data. In this study, we fit a two-strain within-host model to within-host data of HIV. One reason for the gap in the literature is perhaps the fact regarding how within-host HIV data are usually collected. Typically, only the viral load, and possibly the CD4 cell count, are measured. This way of collecting data leaves doubt about whether the parameters in a multi-strain model can be identified from so few data sets. To address this problem, we studied the identifiability of the model relative to the data given. The problem of identifiability of the parameters was first considered in an epidemic model in [17], and since then, it has been applied to other epidemic and immuno-epidemic models [18,19]. Identifiability has been applied to within-host single-strain HIV models [20]; however, we are not aware of identifiability analysis being applied to multi-strain models, even to multi-strain models studied extensively in the literature, such as the one we consider in this article [3,8,9]. We studied the identifiability of our previously studied two-strain model with respect to two data sets. Data set one had the within-host viral load and CD4+ cells of patients subject to multiple drugs in multiple classes of drugs. Further, we studied the identifiability of our model with respect to another data set, data set two, which contained data on the sensitive and resistant strain as well as CD4+ cells for a patient treated with one drug [3]. With respect to data set two, our model is completely structurally identifiable; however, collecting strain-specific data is possibly expensive and difficult, particularly in patients treated with multiple drugs. For that reason, we studied the structural identifiability of our model with respect to data set one. We further studied the practical identifiability to understand better how well the specific data allow us to estimate the parameters.
This paper is structured as follows. In Section 2, we introduce the model. In Section 3, we perform the identifiability analysis of the model. In Section 3, we study the structural identifiability with respect to both data sets. In Section 4, we fit the model to the two types of data sets, and we estimate the parameters. In the case of Data Set 1, we used Monolix [21] to derive population-level parameters as well. In Section 4, we also study the practical identifiability of the model with respect to each of the two types of data sets. Section 5 contains the discussion of our results.

2. Model Formulation

HIV primarily targets and infects CD4+ T cells, which play an important role in the adaptive immune response. In return, infected CD4+ T cells produce HIV viral particles. For the dynamics of sensitive and resistant HIV, we used the following well-studied HIV within-host model [3,9].
d T d t = λ d T k s V s T k r V r T d T s d t = ( 1 u ) k s V s T δ T s d V s d t = π s T s c V s d T r d t = u k s V s T + k r V r T δ T r d V r d t = π r T r c V r
These differential equations describe the dynamics of CD4+ T cells (T), cells infected with drug-resistant virus ( T r ), cells infected with drug-susceptible virus ( T s ), drug-resistant viral load ( V r ), and drug-susceptible viral load ( V s ). The parameter λ represents the recruitment rate of uninfected CD4+ T cells, d is the per capita death rate of uninfected T-cells, and k r and k s are the infection rates of target cells by drug-resistant and drug-susceptible viruses, respectively. Regarding the interplay between susceptible and resistant strains of the virus, we set u ( 0 u < 1 ) to define the mutation rate from the sensitive strain to the resistant strain of the virus. π r and π s represent the burst sizes of the drug-resistant and drug-susceptible virus, respectively, which are the total number of virus particles released by a productive infected cell over its lifespan. c is the clearance rate of the virus, while δ is the death rate of infected CD4+ T cells.
Figure 1 depicts a flow diagram of the model. Table 1 lists the parameters and state variables of the within-host model (1).

3. Identifiability Analysis

Identifiability analysis is an assessment of whether a set of observations can be used to uniquely (structural identifiability) and accurately (practical identifiability) estimate model parameters. Within identifiability analysis, there are two sequential stages: (i) structural identifiability and (ii) practical identifiability. Structural identifiability is used to evaluate whether the parameters of a within-host model can be uniquely derived from infinitely many noise-free data. Structural identifiability analysis relies on the relation between a model and the observations. Consequently, structural identifiability analysis is a prerequisite for parameter estimation. If the model is structurally identifiable, then we may continue to estimate the parameters from the experimental data. Through practical identifiability analysis, one evaluates the parameters of the within-host model to see if they can be determined from experimental data with varying degrees of noise. There are a variety of methods utilized to study structural identifiability, including but not limited to the Taylor series approach [22], similarity transformation [23], generating series method [24], and the differential algebra approach [25,26]. Similarly, there are several methods to study practical identifiability of within-host models. These methods are profile likelihood [27], Monte Carlo simulations [19,28,29], and the Fisher Information Matrix (FIM) or Correlation Matrix [30,31,32]. In this study, we use the differential algebra method to study structural identifiability and the Monte Carlo simulations to examine the practical identifiability.

Structural Identifiability Analysis

We begin by assessing the structural identifiability of the within-host model and rewrite the model (1) in the following compact form:
x ( t ) = f ( x ( t ) , p ) , x ( 0 ) = x 0
y ( t , p ) = g ( x ( t ) , p )
Here, t represents time, x represents the state variables, and p represents the parameter vector with p = ( λ , δ , c , d , k s , k r , π r , π s , u ) . We refer to y ( t , p ) as observations; it is a smooth curve where the data are measured in discrete time. The observations, y ( t , p ) , are functions of the state variables. The within-host model given in compact form (2) is identifiable if the parameter vector p can be uniquely determined from the given observation y ( t , p ) , (3). Otherwise, it is said to be unidentifiable. To begin, we provide the definition of structural identifiability [19,29].
Definition 1.
  • Globally identifiable: Let p and p ^ be two distinct parameter vectors. The model (2) is said to be structurally globally (uniquely) identifiable if
    g ( x ( t ) , p ) = g ( x ( t ) , p ^ ) implies p = p ^ .
  • Locally identifiable: The model (2) is said to be locally structurally locally identifiable if for any p within an open neighborhood of p ^ in the parameter space,
    g ( x ( t ) , p ) = g ( x ( t ) , p ^ ) implies p = p ^ .
In AIDS clinical studies, the viral load and total CD4+ T-cell counts are usually measured. Therefore, in this study, we performed two the observations: the first one was the target cells, denoted by y 1 ( t , p ) , and the second one was the total viral load, denoted by y 2 ( t , p ) . With regard to the model variables, the observations can be written as follows:
y 1 ( t , p ) = T ( t ) + T s ( t ) + T r ( t ) and y 2 ( t , p ) = V r ( t ) + V s ( t ) .
To analyze the structural identifiability of the model, we utilized the differential algebra method [20]. The differential algebra method allows for the removal of the unobserved state variables, and by doing so, we derive an equation referred to as an input–output equation, which involves only the model parameters and observed state variables. The following input–output equations for model (1) with the observations y 1 ( t , p ) and y 2 ( t , p ) were obtained using the Differential Algebra for Identifiability of SYstem (DAISY v2.1) software [26]. The input–output equations of the system (1) with the observations (4) are the following monic differential polynomials (5) and (6).
0 = y 1 + y 1 y 2 k r k s ( π r π s ) ( 1 u ) π r ( k r k s u ) k s π s ( 1 u ) + y 1 ( d + δ ) + y 2 ( d δ ) ( k s k r ) π r ( k r k s u ) k s π s ( 1 u ) + y 2 ( c + δ ) ( d δ ) ( k s k r ) π r ( k r k s u ) k s π s ( 1 u ) + y 1 y 2 δ k r k s ( π r π s ) ( 1 u ) π r ( k r k s u ) k s π s ( 1 u ) + y 1 d δ + y 2 ( c d δ c δ 2 ) ( k s k r ) + k r k s λ ( π r π s ) ( 1 u ) π r ( k r k s u ) k s π s ( 1 u ) δ λ
and
0 = y 1 y 1 + y 1 y 1 δ y 1 λ y 1 2 + y 1 y 1 ( c δ ) + y 1 y 1 δ ( c + δ ) y 1 λ ( c + d ) + y 1 3 π r π s ( k s k r ) ( d δ ) ( π r π s ) + y 1 2 y 2 π r k r π s k s π r π s + y 1 2 y 1 π r π s ( k s k r ) ( d + 2 δ ) ( d δ ) ( π r π s ) + y 1 2 y 2 c ( π r k r π s k s ) π r π s + y 1 2 ( c ( d + δ ) δ 2 + 3 λ π r π s ( k s k r ) ( d δ ) ( π r π s ) ) + y 1 y 2 y 1 2 δ ( π r k r π s k s ) π r π s + y 1 y 2 2 λ ( π r k r π s k s ) π r π s + y 1 y 1 2 δ ( 2 d δ ) π r π s ( k s k r ) ( d δ ) ( π r π s ) + y 1 y 1 y 2 2 c δ ( π r k r π s k s ) π r π s + y 1 y 1 2 c d 2 δ c d δ 2 c δ 3 d δ + ( 2 d λ + 4 δ λ ) π r π s ( k s k r ) ( d δ ) ( π r π s ) y 1 y 2 2 c λ ( π r k r π s k s ) π r π s + y 1 c d 2 c d δ + 2 c δ 2 d 2 δ + 2 d δ 2 δ 3 d δ 3 λ π r π s ( k s k r ) ( d δ ) ( π r π s ) + y 2 y 1 2 δ 2 ( π r k r π s k s ) π r π s + y 2 y 1 2 δ λ ( π r k r π s k s ) π r π s + y 2 λ 2 ( π r k r π s k s ) π r π s + y 1 3 d δ 2 π r π s ( k s k r ) ( d δ ) ( π r π s ) + y 1 2 y 2 c δ 2 ( π r k r π s k s ) π r π s + y 1 2 2 c d δ 2 + ( 2 d λ + δ λ ) π r π s ( k s k r ) ( d δ ) ( π r π s ) + y 1 y 2 2 c δ λ ( π r k r π s k s ) π r π s + y 1 c δ ( d + δ ) λ ( d 2 δ ) π r π s ( k s k r ) ( d δ ) ( π r π s ) + y 2 c λ 2 ( π r k r π s k s ) π r π s + λ 2 c δ + λ π r π s ( k s k r ) ( d δ ) ( π r π s ) .
Solving the differential polynomials (5) and (6) is equivalent to solving the within-host model (1) for y 1 ( t , p ) and y 2 ( t , p ) . Therefore, the definition of the structural identifiability within differential algebra approach becomes verifying that the coefficients of the differential polynomials (5) and (6) are one-to-one with respect to the parameters [18,19,20,33].
Definition 2.
Let c ( p ) denote the coefficients of the input–output Equations (5) and (6), where p is the vector of model parameters. The within-host model is said to be structurally identifiable, that is, it is structured to reveal its parameters from the observations y 1 ( t , p ) and y 2 ( t , p ) if and only if
c ( p ) = c ( p ^ ) p = p ^
According to Definition 2, we must show that the mapping of the parameter space to the coefficients is one-to-one. Let us suppose that there is a parameter vector p ^ = ( λ ^ , δ ^ , c ^ , d ^ , k ^ s , k ^ r , π ^ r , π ^ s , u ^ ) that produces the same target cell and viral load observations. Then, setting c ( p ) = c ( p ^ ) , we obtain the following system of nonlinear equations:
δ = δ ^ λ = λ ^ c δ = c ^ δ ^ c = c ^ c + d = c ^ + d ^ d = d ^ k r k s ( π r π s ) ( 1 u ) π r ( k r k s u ) k s π s ( 1 u ) = k ^ r k ^ s ( π ^ r π ^ s ) ( 1 u ^ ) π ^ r ( k ^ r k ^ s u ^ ) k ^ s π ^ s ( 1 u ^ ) ( k s k r ) π r ( k r k s u ) k s π s ( 1 u ) = ( k ^ s k ^ r ) π ^ r ( k ^ r k ^ s u ^ ) k ^ s π ^ s ( 1 u ^ ) π r π s ( k s k r ) π r π s = π ^ r π ^ s ( k ^ s k ^ r ) π ^ r π ^ s π r k r π s k s π r π s = π ^ r k ^ r π ^ s k ^ s π ^ r π ^ s
Here, we note that the DAISY addresses the nonlinear system of Equations (8) by substituting random integer values for the parameters p ^ . However, those solutions given by DAISY are not clearly stating the parameter correlations. Furthermore, 0 u < 1 is not an integer; such integer solutions can not be accepted. Therefore, we used Wolfram Mathematica to solve this nonlinear system (8) and obtained the following:
S 1 = { λ = λ ^ , δ = δ ^ , c = c ^ , d = d ^ , k r = k s , π r = π s }
S 2 = λ = λ ^ , δ = δ ^ , c = c ^ , d = d ^ , k r = k ^ r , π s = ( k ^ r k ^ s ) π ^ r π ^ s ( k ^ r π ^ r k s π ^ r + k s π ^ s k ^ s π ^ s ) , π r = π ^ r , u = ( k ^ r π ^ r ( k s k ^ s ( 1 u ^ ) ) + k ^ s ( k ^ s π ^ s ( 1 u ^ ) k s ( π ^ s ( 1 u ^ ) + π ^ r u ^ ) ) ) ( k s ( k ^ r k ^ s ) π ^ r )
S 3 = λ = λ ^ , δ = δ ^ , c = c ^ , d = d ^ , k r = k ^ s ( k ^ r π ^ r k ^ s π ^ s ) ( 1 u ^ ) ( k ^ r π ^ r + k ^ s π ^ s ( 1 u ^ ) + k ^ s π ^ r u ^ ) , π s = ( k ^ r k ^ s ) π ^ r π ^ s ( k ^ r π ^ r k s π ^ r + k s π ^ s k ^ s π ^ s ) , π r = ( π ^ s ( k ^ r π ^ r k ^ s π ^ s ( 1 u ^ ) k ^ s π ^ r u ^ ) ) ( k ^ r π ^ r k ^ s π ^ s ) , u = ( k s k ^ r ) ( k ^ r π ^ r k ^ s π ^ s ) k s π ^ s ( k ^ r k ^ s )
We obtained three sets of solutions; S 1 , S 2 , and S 3 , each displaying several parameter correlations. The parameters λ , δ , c , and d can be identified, but complex correlations exist between the parameters k s , k r , π s , π r , and u. It is clear that the within-host model is not structurally identifiable. Upon further inspection, we notice that the solution set S 3 is inadmissible. To show that S 3 is not an acceptable solution, we multiply k r and π r in S 3 and obtain the following:
k r × π r = ( k ^ s ( k ^ r π ^ r k ^ s π ^ s ) ( 1 u ^ ) ) ( k ^ r π ^ r + k ^ s π ^ s + k ^ s π ^ r u ^ k ^ s π ^ s u ^ ) × ( π ^ s ( k ^ r π ^ r + k ^ s π ^ s ( 1 + u ^ ) k ^ s π ^ r u ^ ) ) ( k ^ r π ^ r k ^ s π ^ s ) = k ^ s ( 1 u ^ ) ( k ^ r π ^ r k ^ s π ^ s k ^ s π ^ r u ^ + k ^ s π ^ s u ^ ) × ( π ^ s ( k ^ r π ^ r + k ^ s π ^ s ( 1 + u ^ ) k ^ s π ^ r u ^ ) ) = k ^ s ( 1 u ^ ) ( k ^ r π ^ r + k ^ s π ^ s ( 1 + u ^ ) k ^ s π ^ r u ^ ) × ( π ^ s ( k ^ r π ^ r + k ^ s π ^ s ( 1 + u ^ ) k ^ s π ^ r u ^ ) ) = k ^ s π ^ s ( 1 u ^ )
We obtain k r π r = k ^ s π ^ s ( 1 u ^ ) , which means that k r π r < 0 since 0 < u ^ < 1 . As such, either k r or π r must be negative, which is not possible since all parameters in our model have positive values. Therefore, solution set S 3 is not an admissible solution. We summarize the structural identifiability results in the following Proposition 1.
Proposition 1.
As evident by solution sets S 1 and S 2 , the within-host model (1) is not structured to reveal its parameters from the observations of target cell count, y 1 ( t ) = T ( t ) + T s ( t ) + T r ( t ) , and viral load measurements, y 2 ( t ) = V r ( t ) + V s ( t ) .
Our goal is to obtain a structurally identifiable model from the observations of target cell count and viral load measurements. We can achieve this goal in two different ways and present these two ways in this study: (1) fix some parameters so that the remaining parameters become structurally identifiable or (2) add more information, namely, observations, to the identifiability analysis. Let us explore the first approach. First, we fix the rate at which infected cells with sensitive HIV produce new viral particles per day, namely, the parameter π s . We also impose that the rate at which infected cells with resistant HIV produce new viral particles is different from the infected cells with sensitive HIV. Therefore, we add π s = π ^ s and π r π s to the system of nonlinear equations c ( p ) = c ( p ^ ) given in (8). We solved the system in Mathematica and obtain the following solution:
L 1 = { λ = λ ^ , δ = δ ^ , c = c ^ , d = d ^ , u = u ^ , π s = π ^ s , k s = k ^ s , k r = k ^ r , π r = π ^ r } L 2 = { λ = λ ^ , δ = δ ^ , c = c ^ , d = d ^ , k s = k ^ s , π s = π ^ s , k r = k ^ s ( k ^ r π ^ r k ^ s π ^ s ) ( 1 u ^ ) π ^ r ( k ^ s u ^ k ^ r ) + k ^ s π ^ s ( 1 u ^ ) , π r = π ^ r ( π ^ r ( k ^ s u ^ k ^ r ) + k ^ s π ^ s ( 1 u ^ ) ) k ^ r π ^ r k ^ s π ^ s , u = 1 k ^ r π ^ r k ^ s π ^ s }
A closer look at the solution set L 2 reveals that π r k r = k ^ s π ^ r ( 1 u ^ ) . Since u is less than one and all the parameters are positive, the equality gives the product of the parameters π r and k r to be negative, which is not possible. Therefore, the set L 2 is not admissible for the HIV model (1). We can only accept solution set L 1 , which yields a structurally identifiable model. We summarize the results in the following Proposition 2.
Proposition 2.
If the viral busting rate π s of CD4+ T-cells infected with sensitive virus is known, and if we further impose that π s and π r are not identical, then the within-host model (1) is structured to reveal its parameters from the observations of target cell count and viral load measurements.
Now, we explore the second approach to obtaining a structurally identifiable within-host model, that is, adding more data, hence observations, to the identifiability analysis. In an AIDS clinical study, it is not possible to measure the infected CD4+ T cells separately, meaning that the number of CD4+ T cell count includes both healthy and infected cells. On the other hand, we found that in this AIDS clinical studies [3,8], drug-sensitive and drug-resistant viral loads are measured separately. Therefore, the observations in this AIDS clinical study take the form of the number of target cells, denoted by y 1 ( t , p ) , the drug-sensitive viral load, denoted by y 2 ( t , p ) , and the drug-resistant viral load, denoted by y 3 ( t , p ) . With regard to the model (1) variables, the observations can be written as follows:
y 1 ( t , p ) = T ( t ) + T s ( t ) + T r ( t ) and y 2 ( t , p ) = V s ( t ) and y 3 ( t , p ) = V r ( t )
For within-host model (1) and the observations (9), DAISY gives the following set of input–output equations:
0 = y 1 y 2 ( d δ ) π s y 3 ( d δ ) π r + y 1 d y 2 c ( d δ ) π s y 3 c ( d δ ) π r λ 0 = y 1 + y 1 y 2 k s + y 1 y 3 k r + y 1 ( d + δ ) + y 1 y 2 k s δ + y 1 y 3 k r δ + y 1 d δ y 2 k s λ y 3 k r λ δ λ 0 = y 1 y 2 k s π s ( 1 u ) ( d δ ) + y 2 + y 2 ( c + δ ) + y 1 y 2 k s δ π s ( 1 u ) d δ + y 2 ( k s λ π s u k s λ π s + c d δ c δ 2 ) d δ
Applying Definition 2, we set c ( p ) = c ( p ^ ) and obtain the following solution:
λ = λ ^ , δ = δ ^ , c = c ^ , d = d ^ , u = u ^ , π s = π ^ s , k s = k ^ s , k r = k ^ r , π r = π ^ r .
This means that the within-host model (1) is structured to reveal its parameters from the observations of total CD+ T-cell count, drug-sensitive HIV load, and drug-resistant HIV load. We state the following Proposition 3.
Proposition 3.
The within-host model (1) is structurally identifiable from the observations of target cell count, y 1 ( t ) = T ( t ) + T s ( t ) + T r ( t ) , drug-sensitive viral load y 2 ( t ) = V s ( t ) , and drug-resistant viral load y 3 ( t ) = V r ( t ) measurements.

4. Parameter Estimation and Data

4.1. Estimating Model Parameters from AIDS Clinical Trial Data Set with Nine Patients

Data Set 1: To estimate the parameters of the within-host model (1), we used the data from the Stanford HIV Drug Resistance Database [34]. Specifically, we used the data collected from the AIDS Clinical Trial Groups (ACTG) 5257 [35]. The ACTG 5257 study was performed on treatment-naive people over the age of 18 who had HIV-1 RNA levels of over 1000 copies/mL. More than 1800 participants were enrolled in the study with a median age of 37, 24 % being women. CD8 T cell count was not reported in the study. Virologic failure was defined as RNA levels greater than 1000 copies/mL after 16 weeks and before 24 weeks or 200 copies/mL at or after 24 weeks. It is reported that among those who received the non-nucleoside reverse transcriptase inhibitor (NNRT) Raltegravir, 3 % experienced virologic failure as a result of drug resistance [35]. From the data set of this study, we chose 9 patients who were on NNRT and protease inhibitor (PI) regimens and showing viral failure. However, it is important to note that there is no evidence that these specific 9 patients experienced virologic failure as a result of developing resistance. In principle, virologic failure may occur as a result of different reasons such as the patient not following the prescribed medication regimen.
  • Estimating Model Parameters From Data Set 1: Since we had data from 9 HIV infected individuals from the AIDS clinical data set [35], to estimate the parameters of the within-host model (1), we used the nonlinear mixed effect modeling approach. The nonlinear mixed effect model is defined
    y i , j 1 = y 1 ( t j , p i ) + c 1 y 1 ( t j , p i ) ξ ϵ i , j ϵ i j N ( 0 , σ 2 )
    y i , j 2 = y 2 ( t j , p i ) + c 2 y 2 ( t j , p i ) ξ ϵ i , j ϵ i j N ( 0 , σ 2 )
    where y 1 ( t j , p i ) is the model prediction of total CD4+ T-cell counts at time t j of the ith individual, and y i , j 1 is the CD4+ T-cell count of the ith individual at time t j . Similarly, y 2 ( t j , p i ) is the model prediction of the total viral load on log scale at time t j of the ith individual, and y i , j 2 is the log scale viral load data of the ith individuals at time t j . The terms c 1 y 1 ( t j , p i ) ξ and c 2 y 2 ( t j , p i ) ξ represent the statistical error models. In most cases, ξ = 0 , which corresponds to the constant (or additive) error model, and when ξ = 1 , it is called the relative (proportional) error model. We assume that the total CD4+ T-cell count follows an additive error model, while the viral load data follow a relative error model. The p i = ( λ , δ , c , d , k s , k r , π r , π s , u ) is the parameter vector for the ith individual. The random effect is then defined as
    p i = p ˜ + η i η i N ( 0 , ω i 2 )
    where p ˜ is the fixed population parameter, and η i is the random effect. The individual parameters p i follow a normal distribution whose mean is the fixed population parameter p ˜ , and the standard deviation is ω i .
We use the stochastic approximation estimation-maximization (SAEM) algorithm in Monolix [21] to estimate the mean and standard deviation of parameters p i . We assume that the individual parameters p i are log-normally distributed, with mean ln ( p ˜ ) and standard deviation ω i . We set the initial drug-resistant viral load to zero and further set the initial number of infected cells with sensitive and resistant viruses to zero. Hence, V r ( 0 ) = 0 viral RNA copies per mL, T s ( 0 ) = 0 , T r ( 0 ) = 0 cells per µL. Seven of the nine HIV-positive patients had a known total target cell count at time t = 0, which ranges between 20 and 308 cells per microliter. Therefore, we set the initial target cell count to the mean of the 7 patients and set T ( 0 ) = 178 cells per µL, and we set the initial sensitive viral load, V s ( 0 ) = 5204600 viral RNA copies/mL, to the mean of the 9 patients. The structural identifiability analysis indicates that fixing the parameter π s is necessary to determine the remaining parameters uniquely. Therefore, we set π s = 4.57 viral RNA copies per cell per day. The estimated values for each patient ( p i ) and the fixed population parameters ( p ˜ ) and their standard deviations ( ω i ) are presented in Table 2.
Table 2 shows that infected CD4+ T cells die at a higher rate than uninfected cells, which is consistent with prior findings. The half-lives of uninfected cells are 92 days on average, and infected cells are 46 days, based on the fixed population estimates d ˜ and δ ˜ ( ln ( 2 ) / 0.0075 = 92 and ln ( 2 ) / 0.015 = 46 ). The clearance rate of HIV is c ˜ = 2.42 per day, giving a half-life of 7 h. For the 9 HIV-infected individuals, the production rate of CD4+ T cells ranged from 1.46 to 8.52 cells per day per milliliter of blood. CD4+ cells infected with a resistant virus produce 14.12 viral RNA copies per cell per day, higher than the cells infected with a sensitive virus. The infection rate of the resistant virus ( k ˜ r = 2.2 × 10 5 per viral particle per day in a mL blood) was larger than the infection rate of the sensitive virus ( k ˜ s = 1.7 × 10 5 per viral particle per day in a mL blood). All nine HIV patients had an estimated mutation rate of zero. This implies that mutation occurs at a single time rather than continually. Drug-resistance mutations are probably best modeled as a dirac-delta function.
Figure 2 shows predictions for each patient using the within-host model (1) with the parameters given in Table 2. We present the total viral load prediction by the within-host model (1) in Figure 2a, resistant viral load prediction in Figure 2b, and sensitive viral load in Figure 2c. After starting NNRT and PI antiviral therapy, patients’ total viral load clearly stays below 100 (2 on log scale) viral RNA copies. However, about two years into the treatment, virologic failure is detected (see dashed lines in Figure 2a). Furthermore, in all nine individuals, the resistant virus fully eliminated the sensitive virus. Despite following the same NNRT and PI regimen, the nine patients had significantly different total CD4+ cell counts (see Figure 3).
The dashed line in Figure 2a indicates the time of virologic failure as defined by the clinical trial authors [35]. Specifically, it shows the point at which, after 24 weeks of antiretroviral therapy, the overall viral load surpasses 200 viral RNA levels. However, the point at which the patient first exhibits the resistant virus is marked by the dashed line in Figure 2b. We see at least a 100-day relapse between the initial emergence of the resistant virus and the detection of virologic failure for each patient. We present in Figure 4 the distribution of these time points. Patients develop resistant viruses on average 844 days after the initiation of the treatment, but detection occurs 970 days after.

Practical Identifiability Analysis

Monolix provides the distribution of the parameters obtained by fitting the within-host model to the AIDS clinical trial data (see Figure 5), as well as the correlations between the random effects η (see Figure A1). Figure A1 indicates that the parameter λ is correlated with c , d , k r and π r . These parameters also exhibit bimodal distributions (refer to Figure 5), particularly for λ and c. The parameters λ and c have the highest Pearson correlation coefficient ( 0.88 ). The correlation between parameters λ and d is substantial with a Pearson coefficient of 0.64 . The Pearson correlation between parameters λ and k r is 0.51 , while that for λ and π r is 0.31 . The correlation threshold for claiming that a parameter is practically unidentifiable based on the Pearson coefficient is unclear. Based on our earlier studies [30], we claim that the parameters λ and c are practically unidentifiable from fitting the within-host model (1) to the AIDS clinical trial data with nonlinear mixed effect modeling.

4.2. Estimating Model Parameters from Published Data with Single HIV Patient

  • Data Set 2: Our structural identifiability analysis shows that measuring sensitive and resistant viral loads separately can uniquely determine the parameters of the within-host model (1). However, we failed to find any AIDS clinical trials in the Stanford HIV Drug Resistance Database [34] that assessed both resistant and sensitive viruses individually. On the other hand, we did find a clinical trial with data published in [3] where the resistant and sensitive virus were measured separately. We used the published data in [3] where HIV-infected individuals were prescribed with the non-nucleoside reverse transcriptase (NNRT) inhibitor Nevirapine (NVP), and their response to the medication was assessed at specific time points: 0, 14, 28, 56, and 140 days after initiating therapy. During these evaluations, the total CD4+ T-cell count per µL was measured, along with the presence of HIV strains (per µL plasma) that are sensitive or resistant to NVP [3].
We fit the within-host model (1) using two separate data sets, each with distinct characteristics. To begin, in clinical study [35], data are available for around 1400 days, nearly 4 years, whereas in study [3], data are only available for 140 days (roughly 0.4 year). While it takes two years of treatment for the viral failure to develop in [35], in [3] it happens in just fourteen days. Furthermore, the total CD4+ cell count in [3] is very small compared to the same data in [35]. The HIV viral RNA copies are measured in a mililiter of blood in [35] and in µL of blood in [3]. Since the main objective of this study was to perform an identifiability analysis of drug-resistant model parameters using clinical data sets, we proceeded with the available data sets [3,35].
  • Estimating Model Parameters from Data Set 2: To estimate the within-host model (1) parameters, we fit the predicted total CD4+ cell count and drug-resistant and drug-sensitive viral loads to the published data in [3]. Simply put, we minimized the Euclidean distances between the model predictions and the data. Specifically, we minimized the following objective function with constraints to estimate the model parameters.
    p ^ = min p j = 1 4 ( y 1 ( t j , p ) y 1 , j ) 2 1 / 2 + j = 1 4 ( y 2 ( t j , p ) y 2 , j ) 2 1 / 2 + j = 1 4 ( y 3 ( t j , p ) y 3 , j ) 2 1 / 2 s . t . δ d and π r π s and l b p u b
    where l b and u b are the lower and upper bounds for the parameters p, respectively. We assume known initial conditions. The initial conditions are determined by the data at the initiation of the treatment. As before, we set T s ( 0 ) = 0 , T r ( 0 ) = 0 cells per µL, and V r ( 0 ) = 0 RNA copies per µL. The total CD4+ cell count at t = 0 is 116 cells per µL; therefore, we set T ( 0 ) = 116 , and the total viral load at t = 0 is 130 viral RNA copies per µL, that is, V s ( 0 ) = 130 . Because the drug-sensitive virus decreases significantly during the first 14 days of the therapy and the drug-resistant virus emerges after 14 days, we assume that the drug-resistant parameters are time-dependent. Moreover, we suppose that the parameters k r , π r , and u vary with time. They are specifically described as the step functions listed below.
    k r ( t ) = 0 t < 14 k r t 14 π r ( t ) = 0 t < 14 π r t 14 u ( t ) = 0 t < 14 u t 14
We numerically solved (12) using fmincon in Matlab R2024a Update 3, and we present the optimal values p ^ in Table 3. The within-host model predictions with the estimated values are presented in Figure 6.

4.3. Structural Identifiability Analysis of Within-Host Model with Time-Dependent Parameters

To estimate the parameters from the second data set [3], we used the following within-host model with time-dependent parameters (13),
d T d t = λ d T k s V s T k r ( t ) V r T d T s d t = ( 1 u ( t ) ) k s V s T δ T s d V s d t = π s T s c V s d T r d t = u ( t ) k s V s T + k r ( t ) V r T δ T r d V r d t = π r ( t ) T r c V r
Since the time-dependent parameters are only step functions, the model (14) can be considered as a coupled ODE system with constant coefficients (see Figure 7). Therefore, we study the structural identifiability of the model (14) by studying the structural identifiability of the coupled system with constant ODEs. When t < 14 , the within-host model with a time-dependent parameter reduces to the following system with constant coefficients:
d T c d t = λ d T c k s V T c d I c d t = k s V T c δ I c d V d t = π s I c c V
where T c ( t ) denotes the number of target cells, I c ( t ) denotes the number of infected target cells and V ( t ) denotes the HIV. The system starts with initial conditions T c ( 0 ) = 116 , I c ( 0 ) = 0 , and V ( 0 ) = 130 . Then, the within-host model (1) with time-dependent parameters (13) starts at time t = 14 with initial conditions T ( 14 ) = T c ( 14 ) , T s ( 14 ) = I c ( 14 ) , V s ( 14 ) = V ( 14 ) , T r ( 14 ) = 0 , V r ( 14 ) = 0 . We studied the structural identifiability analysis of the within-host model (1) from the observations of total target cell count, y 1 ( t ) = T ( t ) + T s ( t ) + T r ( t ) , drug-sensitive viral load y 2 ( t ) = V s ( t ) , and drug-resistant viral load y 3 ( t ) = V r ( t ) observations in Section 3. When t < 14 , the model reduces to (15); therefore, we need to study the structural identifiability of the within-host model (15) from the observations of y 1 ( t ) = T c ( t ) + I c ( t ) and viral load y 2 ( t ) = V ( t ) . We follow the same steps as in Section 3 and first obtain the input–output equations:
0 = y 1 + y 2 ( δ d ) π s + y 2 ( δ d ) π s c + y 1 d λ 0 = y 2 y 2 y 2 k s y 2 ( c + δ ) + y 2 2 k s c + y 1 y 2 π s k s y 2 c δ
Solving c ( p ) = c ( q ) , we obtain a unique solution
λ = λ ^ , δ = δ ^ , c = c ^ , d = d ^ , π s = π ^ s , k s = k ^ s .
Hence, we conclude that model (15) is structurally identifiable from the observations relevant to it.

4.4. Practical Identifiability Analysis of Within-Host Model Parameters from Data Set 2

To study whether the parameters can be identified from the Data Set 2, we performed Monte Carlo simulations. We set the estimated parameters p ^ from constraint optimization problem (12) as the true parameter set and assumed that the data satisfy the following statistical error models:
y 1 , j = y 1 ( t j , p ^ ) + ϵ j
y 2 , j = y 2 ( t j , p ^ ) + ϵ j
y 3 , j = y 3 ( t j , p ^ ) + ϵ j where ϵ j N ( 0 , σ 2 )
where y 1 ( t j , p ^ ) is the model prediction of total CD4+ T-cell count at time t j and y 1 , j is the CD4+ T-cell count data at time t j , where t j = { 14 , 28 , 56 , 140 } . Similarly, the y 2 ( t j , p ^ ) , and y 3 ( t j , p ^ ) are the model prediction of the drug-sensitive and drug-resistant viral load at time t j , respectively; y 2 , j and y 3 , j are the drug-sensitive and drug-resistant viral load data at time t j , respectively. Simply, we assume that the measured data follow a normal distribution whose mean is the model prediction and standard deviation is σ . We generate M = 1000 data sets using the error models (16)–(18) with increasing noise levels by setting σ = 0.01 , 0.05 , 0.1 , and 0.2 . Then, we fit each of the 1000 data sets to the within-host model by minimizing (12). This would result in 1000 parameter estimates for each noise level. Since, the true parameter set that generated the data set is known, we can compute the Average Relative Estimation (ARE) error by
A R E ( p ) = 100 % × 1 M j = 1 M | p ^ p j | | p ^ |
where p ^ is the true parameter vector, and p j is the estimated parameter vector from the data set j with j = 1 , 2 , , M = 1000 . We use A R E to determine whether a parameter is practically identifiable or not by using the following Definition 3 [28,36].
Definition 3.
Let A R E be the average relative estimation error of the parameter p. The practical identifiability of parameter p is determined by comparing A R E to the measurement error.
i.
If 0 A R E σ , then parameter p is (strongly) practically identifiable;
ii.
If σ < A R E 10 σ , then parameter p is weakly practically identifiable;
iii.
If 10 σ < A R E , then parameter p is not practically identifiable.
A model is said to be practically identifiable when all parameters p of the model are practically identifiable(Table 4).
Following Definition 3, we claim that the parameters λ , π r , c , and δ are practically identifiable, and the parameters d , k s , k r , u , and π s are weakly practically identifiable (Table 5).
From Definition 3, we claim that the parameters λ , π r , c , δ , d , k s , k r , π s are practically identifiable, while the parameter u is weakly practically identifiable. This shows that collecting more frequent data drastically improves the identifiability of parameters, which, in turn, strengthens the confidence in our results.

5. Discussion

In this paper, we consider a sensitive strain-resistant strain model of HIV, which has been previously discussed in [8,9]. The model has been thoroughly analyzed [9] but has never been fit to data. Our goal here was to connect the model to within-host HIV data. We used two data sets: Data Set 1, which is an excerpt of a large data set of Stanford Drug Resistance Clinical Trial group [35], and Data Set 2, which consists of data published in [3]. Data Set 2 is an older data set of the occurrence of resistance with respect to one drug and measures both the sensitive- and resistant-strain viral load as well as CD4+ cells. Data Set 1 is a newer data set of virologic failure when patients are on ART, that is, multiple drugs in multiple classes of drugs, and it measures only the total viral load and CD4+ cells.
The problem of fitting models to data is generally an ill-posed problem [19], meaning that multiple parameter combinations may result in the same fits. So, this problem needs to be studied rigorously to understand the accuracy of our parameter estimations. Thus, our first goal was to study the structural identifiability of our model with respect to each data set. We found that our model is structurally identifiable with respect to Data Set 1 if π s is known and fixed, and π r is distinct from π s . On the other hand, our model is structurally identifiable with respect to Data Set 2 with no conditions. These results hold regardless of the quality of the data given.
We fit the model to the two data sets in different models. From Data Set 1, we had nine patients exhibiting virologic failure; we use mixed effects modeling and Monolix [21] to fit all nine patients simultaneously and infer “population-level” parameters. Monolix gives correlations of parameters, and we concluded that parameter λ is correlated with c, d, k r , and π r . Thus, according to Monolix, the above parameters are not practically identifiable, but the level at which they can be determined remains unknown. We fit our model to Data Set 2 using least squares. We performed Monte Carlo simulations to gauge the practical identifiablity of the parameters. We conclude that parameters λ , π r , c , δ are practically identifiable (that is, identifiable within the measurement error), parameters π s , d , k s , k r are weakly practically identifiable (that is identifiable within 10 times the measurement error), and u is not identifiable. We surmise that if u = u ( t ) is a delta function, it may make the remaining parameters identifiable, but this needs to be investigated in the future. Mutation as a delta function has been considered in an epidemiological context [37].
With reasonably reliably estimated parameters, we can draw the following conclusions. First, Data Set 1 does not give the viral load of the sensitive and resistant (to all drugs) strains separately; however, we are now capable with the help of the model to infer the viral load of sensitive (to at least one drug) strains and the resistant-to-all-drugs strain. Second, since Data Set 1 does not give data on the sensitive and resistant strains separately, it is not clear from the data when the drug-resistant strain emerges during treatment. We can now estimate this moment patient-by-patient or by a population-level average. The population-level average of the time of emergence of the resistant strain is 844 days after treatment regiment starts. In this case, the population-level virologic failure occurs at 970 days after the treatment’s start date. Because in Data Set 2 sensitive and resistant strains’ viral loads are given separately, the time of emergence of the resistant strain can be inferred from the data. This happens at day 14 after the start of treatment. From this comparison, we see how much more superior ART is to the single-drug treatment used in the early days of HIV. However, because with multiple drugs used at the same time, resistant strains of each drug can occur and collecting these type of detailed data is difficult or impossible, it is imperative to develop tools, such as the ones in this article, to be able to decipher that information from data on total viral load and total CD4+ cells.

Author Contributions

Conceptualization, N.T. and M.M.; methodology, N.T. and M.M.; software, V.S., K.G., A.G. and M.Z.; validation, N.T.; formal analysis, N.T.; investigation, V.S., K.G., A.G. and M.Z.; resources, V.S., K.G., A.G. and M.Z.; data curation, V.S., K.G., A.G. and M.Z.; writing—original draft preparation, N.T. and M.M.; writing—review and editing, N.T. and M.M.; visualization, N.T.; supervision, N.T.; project administration, N.T.; funding acquisition, N.T. and M.M. All authors have read and agreed to the published version of the manuscript.

Funding

NT acknowledges partial support from National Science Foundation (NSF) grant DMS 1951626 and National Institute of Health (NIH) NIGMS 1R01GM152743-01.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Data Set 1

Figure A1. Monolix result with the correlation between random effects. Pearson’s correlation coefficient between the random effect η λ and η c is 0.88. Pearson’s correlation coefficient between the random effect η λ and η d is 0.64. Pearson’s correlation coefficient between the random effect η λ and η k r is −0.51. Pearson’s correlation coefficient between the random effect η λ and η π r is −0.31.
Figure A1. Monolix result with the correlation between random effects. Pearson’s correlation coefficient between the random effect η λ and η c is 0.88. Pearson’s correlation coefficient between the random effect η λ and η d is 0.64. Pearson’s correlation coefficient between the random effect η λ and η k r is −0.51. Pearson’s correlation coefficient between the random effect η λ and η π r is −0.31.
Mathematics 12 02583 g0a1

References

  1. HIV.gov. A Timeline of HIV and AIDS. Available online: https://www.hiv.gov/hiv-basics/overview/history/hiv-and-aids-timeline/ (accessed on 12 June 2024).
  2. Larder, B.A.; Kemp, S.D. Multiple mutations in HIV-1 reverse transcriptase confer high-level resistance to zidovudine (AZT). Science 1989, 246, 1155–1158. [Google Scholar]
  3. Nowak, M.A.; Bonhoeffer, S.; Shaw, G.M.; May, R.M. Anti-viral drug treatment: Dynamics of resistance in free virus and infected cell populations. J. Theor. Biol. 1997, 184, 203–217. [Google Scholar] [CrossRef]
  4. Larder, B.A.; Kemp, S.D.; Harrigan, P.R. Potential mechanism for sustained antiretroviral efficacy of AZT-3TC combination therapy. Science 1995, 269, 696–699. [Google Scholar] [PubMed]
  5. National HIV Curriculum. Evaluation and Management of Virologic Failure. Available online: https://www.hiv.uw.edu/go/antiretroviral-therapy/evaluation-management-virologic-failure/core-concept/all (accessed on 1 June 2024).
  6. Rocheleau, G.; Brumme, C.J.; Shoveller, J.; Lima, V.D.; Harrigan, P.R. Longitudinal trends of HIV drug resistance in a large Canadian cohort, 1996–2016. Clin. Microbiol. Infect. 2018, 24, 185–191. [Google Scholar]
  7. Ngina, P.; Mbogo, R.W.; Luboobi, L.S. HIV drug resistance: Insights from mathematical modelling. Appl. Math. Model. 2019, 75, 141–161. [Google Scholar] [CrossRef]
  8. Nowak, M.A.; May, R.M. Virus Dynamics: Mathematical Principles of Immunology and Virology; Oxford University Press: Oxford, UK, 2000; pp. xii+237. [Google Scholar]
  9. Rong, L.; Feng, Z.; Perelson, A.S. Emergence of HIV-1 Drug Resistance during Antiretroviral Treatment. Bull. Math. Biol. 2007, 69, 2027–2060. [Google Scholar] [CrossRef] [PubMed]
  10. Rong, L.; Gilchrist, M.A.; Feng, Z.; Perelson, A.S. Modeling within-host HIV-1 dynamics and the evolution of drug resistance: Trade-offs between viral enzyme function and drug susceptibility. J. Theor. Biol. 2007, 247, 804–818. [Google Scholar]
  11. Rosenbloom, D.I.; Hill, A.L.; Rabi, S.A.; Siliciano, R.F.; Nowak, M.A. Antiretroviral dynamics determines HIV evolution and predicts therapy outcome. Nat. Med. 2012, 18, 1378–1385. [Google Scholar]
  12. Hill, A.L.; Rosenbloom, D.I.S.; Nowak, M.A.; Siliciano, R.F. Insight into treatment of HIV infection from viral dynamics models. Immunol. Rev. 2018, 285, 9–25. [Google Scholar] [PubMed]
  13. De Leenheer, P.; Pilyugin, S.S. Multistrain virus dynamics with mutations: A global analysis. Math Med Biol. 2008, 25, 285–322. [Google Scholar]
  14. Browne, C.J.; Smith, H.L. Dynamics of virus and immune response in multi-epitope network. J. Math. Biol. 2018, 77, 1833–1870. [Google Scholar]
  15. Dorratoltaj, N.; Nikin-Beers, R.; Ciupe, S.M.; Eubank, S.G.; Abbas, K.M. Multi-scale immunoepidemiological modeling of within-host and between-host HIV dynamics: Systematic review of mathematical models. PeerJ 2017, 5, e3877. [Google Scholar]
  16. Perelson, A.S.; Ribeiro, R.M. Modeling the within-host dynamics of HIV infection. BMC Biol. 2013, 11, 96. [Google Scholar]
  17. Eisenberg, M.C.; Robertson, S.L.; Tien, J.H. Identifiability and estimation of multiple transmission pathways in cholera and waterborne disease. J. Theor. Biol. 2013, 324, 84–102. [Google Scholar] [CrossRef] [PubMed]
  18. Tuncer, N.; Marctheva, M.; LaBarre, B.; Payoute, S. Structural and Practical Identifiability Analysis of Zika Epidemiological Models. Bull. Math. Biol. 2018, 80, 2209–2241. [Google Scholar] [CrossRef] [PubMed]
  19. Tuncer, N.; Gulbudak, H.; Cannataro, V.L.; Martcheva, M. Structural and practical identifiability issues of immuno-epidemiological vector-host models with application to Rift Valley Fever. Bull. Math. Biol. 2016, 78, 1796–1827. [Google Scholar]
  20. Miao, H.; Xia, X.; Perelson, A.S.; Wu, H. On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics. SIAM Rev. 2011, 53, 3–39. [Google Scholar] [CrossRef]
  21. Monolix, version 2019R2; Lixoft SAS: Antony, France, 2019.
  22. Pohjanpalo, H. System identifiability based on the power series expansion of the solution. Math. Biosci. 1978, 41, 21–33. [Google Scholar] [CrossRef]
  23. Vajda, S.; Godfrey, K.R.; Rabitz, H. Similarity transformation approach to identifiability analysis of nonlinear compartmental models. Math. Biosci. 1989, 93, 217–248. [Google Scholar] [CrossRef]
  24. Walter, E.; Lecourtier, Y. Global approaches to identifiability testing for linear and nonlinear state space models. Math. Comput. Simul. 1982, 24, 472–482. [Google Scholar]
  25. Ljung, L.; Glad, T. On global identifiability for arbitrary model parametrizations. Automatica 1994, 30, 265–276. [Google Scholar] [CrossRef]
  26. Bellu, G.; Saccomani, M.; Audoly, S.; D’Angio, L. DAISY: A new software tool to test global identifiability of biological and physiological systems. Comput. Methods Programs Biomed. 2007, 88, 52–61. [Google Scholar]
  27. Raue, A.; Kreutz, C.; Maiwald, T.; Bachmann, J.; Schilling, M.; Klingmüller, U.; Timmer, J. Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics 2009, 25, 1923–1929. [Google Scholar]
  28. Heitzman-Breen, N.; Liyanage, Y.R.; Duggal, N.; Tuncer, N.; Ciupe, S.M. The Effect of Model Structure and Data Availability on Usutu Virus Dynamics at Three Biological Scales. R. Soc. Open Sci. 2024, 11, 231146. [Google Scholar] [CrossRef]
  29. Tuncer, N.; Martcheva, M. Determining reliable parameter estimates for within-host and within-vector models of Zika virus. J. Biol. Dyn. 2021, 15, 430–454. [Google Scholar]
  30. Tuncer, N.; Le, T.T. Structural and practical identifiability analysis of outbreak models. Math. Biosci. 2018, 299, 1–18. [Google Scholar] [PubMed]
  31. Miao, H.; Dykes, C.; Demeter, L.M.; Cavenaugh, J.; Park, S.Y.; Perelson, A.S.; Wu, H. Modeling and estimation of kinetic parameters and replicative fitness of HIV-1 from flow-cytometry-based growth competition experiments. Bull. Math. Biol. 2008, 70, 1749–1771. [Google Scholar] [PubMed]
  32. Kao, Y.H.; Eisenberg, M.C. Practical unidentifiability of a simple vector-borne disease model: Implications for parameter estimation and intervention assessment. Epidemics 2018, 25, 89–100. [Google Scholar] [CrossRef] [PubMed]
  33. Gupta, C.; Tuncer, N.; Martcheva, M. Immuno-epidemiological co-affection model of HIV infection and opioid addiction. Math. Biosci. Eng. 2022, 19, 3636–3672. [Google Scholar] [CrossRef]
  34. Rhee, S.Y.; Gonzales, M.; Kantor, R.; Betts, B.; Ravela, J.; Shafer, R.W. Human immunodeficiency virus reverse transcriptase and protease sequence database. Nucleic Acids Res. 2003, 31, 298–303. [Google Scholar] [CrossRef]
  35. Lennox, J.L.; Landovitz, R.J.; Ribaudo, H.J.; Ofotokun, I.; Na, L.H.; Godfrey, C.; Kuritzkes, D.R.; Sagar, M.; Brown, T.T.; Cohn, S.E.; et al. Efficacy and tolerability of 3 nonnucleoside reverse transcriptase inhibitor-sparing antiretroviral regimens for treatment-naive volunteers infected with HIV-1: A randomized, controlled equivalence trial. Ann. Intern. Med. 2014, 161, 461–471. [Google Scholar] [CrossRef] [PubMed]
  36. Sreejithkumar, V.; Ghods, K.; Bandara, T.; Martcheva, M.; Tuncr, N. Modeling the interplay between albumin-globulin metabolism and HIV infection. Math. Biosci. Eng. 2023, 20, 19527–19552. [Google Scholar] [PubMed]
  37. Martcheva, M. An evolutionary model of influenza A with drift and shift. J. Biol. Dyn. 2012, 6, 299–332. [Google Scholar] [PubMed]
Figure 1. Flow diagram describing the interaction between target cells ( T ) , infected cells with drug-resistant virus ( T r ) , infected cells with drug-sensitive virus ( T s ) , drug-sensitive virus ( V s ) , and drug-resistant virus ( V r ) .
Figure 1. Flow diagram describing the interaction between target cells ( T ) , infected cells with drug-resistant virus ( T r ) , infected cells with drug-sensitive virus ( T s ) , drug-sensitive virus ( V s ) , and drug-resistant virus ( V r ) .
Mathematics 12 02583 g001
Figure 2. Monolix fitting results for 9 HIV infected patients. In columns (ac), the red dots are the total viral load in log scale, including both the sensitive and resistant strains. (a) presents the within-host model (1) prediction of the total viral load in log scale (blue lines) together with the data. The dashed line gives the time when the total viral load exceeds 200 viral RNA copies per mL. (b) presents the within-host model (1) prediction of the resistant virus and the data. The dashed line gives the time when the resistant virus first appears in the system. (c) presents the within-host model (1) prediction of the sensitive virus together with the data.
Figure 2. Monolix fitting results for 9 HIV infected patients. In columns (ac), the red dots are the total viral load in log scale, including both the sensitive and resistant strains. (a) presents the within-host model (1) prediction of the total viral load in log scale (blue lines) together with the data. The dashed line gives the time when the total viral load exceeds 200 viral RNA copies per mL. (b) presents the within-host model (1) prediction of the resistant virus and the data. The dashed line gives the time when the resistant virus first appears in the system. (c) presents the within-host model (1) prediction of the sensitive virus together with the data.
Mathematics 12 02583 g002
Figure 3. Monolix fitting results for 9 HIV infected patients. In columns (ac), the red dots are the total CD4+ cell count, including the cells infected with both the sensitive and resistant strains. In columns (ac), the blue curve is the within-host model (1) prediction of the total CD4+ cell count.
Figure 3. Monolix fitting results for 9 HIV infected patients. In columns (ac), the red dots are the total CD4+ cell count, including the cells infected with both the sensitive and resistant strains. In columns (ac), the blue curve is the within-host model (1) prediction of the total CD4+ cell count.
Mathematics 12 02583 g003
Figure 4. The left-hand-side figure presents the distribution of the time points at which the resistant virus first appeared (see Figure 2b). The right-hand-side figure presents the distribution of the time points at which virologic failure is detected (see Figure 2a). Bars represent the empirical distributions, and the red curve is the theoretical distribution. Dashed lines are the mean of the distributions.
Figure 4. The left-hand-side figure presents the distribution of the time points at which the resistant virus first appeared (see Figure 2b). The right-hand-side figure presents the distribution of the time points at which virologic failure is detected (see Figure 2a). Bars represent the empirical distributions, and the red curve is the theoretical distribution. Dashed lines are the mean of the distributions.
Mathematics 12 02583 g004
Figure 5. Parameter distribution generated by fitting within-host model (1) to the total viral load and total CD4+ cell count data.
Figure 5. Parameter distribution generated by fitting within-host model (1) to the total viral load and total CD4+ cell count data.
Mathematics 12 02583 g005
Figure 6. (a) Model predictions (blue curves) of the drug-sensitive viral load with drug-sensitive viral load data (red circles). (b) Model predictions (blue curves) of the drug-resistant viral load with drug-resistant viral load data (red circles). (c) Model predictions (blue curves) of the total CD4+ cell count with total CD4+ cell count data (red circles).
Figure 6. (a) Model predictions (blue curves) of the drug-sensitive viral load with drug-sensitive viral load data (red circles). (b) Model predictions (blue curves) of the drug-resistant viral load with drug-resistant viral load data (red circles). (c) Model predictions (blue curves) of the total CD4+ cell count with total CD4+ cell count data (red circles).
Mathematics 12 02583 g006
Figure 7. The within-host model (14) with time-dependent parameters (13) as a coupled system of ODEs with constant parameters.
Figure 7. The within-host model (14) with time-dependent parameters (13) as a coupled system of ODEs with constant parameters.
Mathematics 12 02583 g007
Table 1. Definitions of the HIV within-host model (1) parameters and state variables and their units.
Table 1. Definitions of the HIV within-host model (1) parameters and state variables and their units.
ParameterDefinitionUnits
λ Recruitment rate of uninfected target cells CD 4 + cells mL 1 day 1
k s Infection rate of target cells by drug-sensitive virus mL vRNA copies 1 day 1
k r Infection rate of target cells by drug-resistant virus mL vRNA copies 1 day 1
uMutation rate from sensitive strain to resistant strain dimensionless
π r Burst size of drug-resistant strain 10 3 vRNA copies ( CD 4 + cells day ) 1
π s Burst size of drug-sensitive strain 10 3 vRNA copies ( CD 4 + cells day ) 1
cClearance rate of free virus day 1
δ Death rate of infected cells day 1
State VariableDefinitionUnits
TPopulation of Target (CD4+) Cells CD 4 + cells per µL
T r Population of Infected Target (CD4+) Cells with Drug-Resistant Virus CD 4 + cells per µL
T s Population of Infected Target (CD4+) Cells with Drug-Susceptible Virus CD 4 + cells per µL
V r Drug-Resistant Viral Load vRNA copies mL 1
V s Drug-Susceptible Viral Load vRNA copies mL 1
Table 2. Monolix parameter estimation results for each HIV-infected individual and the fixed population parameters and their standard deviations.
Table 2. Monolix parameter estimation results for each HIV-infected individual and the fixed population parameters and their standard deviations.
Patient ID λ d k s k r π r c δ u
2601083.14320.00521.7003 × 10−52.0280 × 10−514.29232.66980.01512.2205 × 10−16
2601768.52430.01461.7020 × 10−52.0086 × 10−513.96163.07080.01532.2205 × 10−16
2601412.25690.00701.7086 × 10−52.2728 × 10−514.86271.57930.01532.2205 × 10−16
2600515.35880.00861.7076 × 10−52.2009 × 10−514.31023.10060.01522.2205 × 10−16
2600556.71390.00771.7129 × 10−52.0807 × 10−513.54094.11480.01532.2205 × 10−16
2602751.45730.00691.7190 × 10−52.4921 × 10−516.36911.02950.01552.2205 × 10−16
2602192.18110.00571.7189 × 10−52.0317 × 10−513.86181.79390.01542.2205 × 10−16
2602226.21640.00821.7144 × 10−51.9117 × 10−512.96153.58860.01522.2205 × 10−16
2602785.04140.00861.7000 × 10−52.2101 × 10−514.71302.86750.01532.2205 × 10−16
p ˜ 3.860.00750.0000170.00002214.122.420.0152.2205 × 10−16
ω i 0.60.360.0250.110.110.420.0313.58
Table 3. Estimated parameter values for the within-host model (1) obtained by fitting to total CD4+ cell count and NVP-sensitive and NVP-resistant strains. The units of each parameter and the lower bounds used in the constraint optimization problem (12) are presented.
Table 3. Estimated parameter values for the within-host model (1) obtained by fitting to total CD4+ cell count and NVP-sensitive and NVP-resistant strains. The units of each parameter and the lower bounds used in the constraint optimization problem (12) are presented.
Parameter[lb,ub]Estimated ValueUnits
λ [ 0 , 100 ] 11.64 cells µL−1 day 1
d [ 0.01 , 1 ] 0.0144 day 1
k s [ 0 , 1 ] 0.1220 µL vRNA copies 1 day 1
k r [ 0 , 1 ] 0.0007 µL vRNA copies 1 day 1
u [ 0 , 1 ] 0.0140 dimensionless
π r [ 0 , 100000 ] 40.03 vRNA copies CD 4 + cells day 1
π s [ 0 , 100000 ] 0.1839 vRNA copies CD 4 + cells day 1
c [ 0 , 100 ] 7.47 day 1
δ [ 0.05 , 1 ] 0.4705 day 1
Table 4. Monte Carlo simulation results for the second data set. The ARE of each parameter is presented for each noise level.
Table 4. Monte Carlo simulation results for the second data set. The ARE of each parameter is presented for each noise level.
ParameterARE with σ = 1 % ARE with σ = 5 % ARE with σ = 10 % ARE with σ = 20 %
λ 0.260.971.924.26
d2.8513.4621.2731.31
k s 1.697.1214.5233. 28
k r 1.296.6213.1822.35
u5.9538.26108.38211.75
π r 0.62.64.458.12
π s 1.526.6813.226.43
c0.2412.345.62
δ 0.743.176.4413.32
Table 5. Monte Carlo simulation results with high-frequency data, one data point for each day for 140 days. The ARE of each parameter is presenting for each noise level.
Table 5. Monte Carlo simulation results with high-frequency data, one data point for each day for 140 days. The ARE of each parameter is presenting for each noise level.
ParameterARE with σ = 1 % ARE with σ = 5 % ARE with σ = 10 % ARE with σ = 20 %
λ 0.010.10.230.56
d0.21.543.698.06
k s 0.080.531.252. 99
k r 0.080.521.072.39
u0.534.249.2621.01
π r 0.020.210.521.27
π s 0.060.441.072.49
c0.010.090.20.46
δ 0.060.330.711.49
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Tuncer, N.; Ghods, K.; Sreejithkumar, V.; Garbowit, A.; Zagha, M.; Martcheva, M. Validation of a Multi-Strain HIV Within-Host Model with AIDS Clinical Studies. Mathematics 2024, 12, 2583. https://doi.org/10.3390/math12162583

AMA Style

Tuncer N, Ghods K, Sreejithkumar V, Garbowit A, Zagha M, Martcheva M. Validation of a Multi-Strain HIV Within-Host Model with AIDS Clinical Studies. Mathematics. 2024; 12(16):2583. https://doi.org/10.3390/math12162583

Chicago/Turabian Style

Tuncer, Necibe, Kia Ghods, Vivek Sreejithkumar, Adin Garbowit, Mark Zagha, and Maia Martcheva. 2024. "Validation of a Multi-Strain HIV Within-Host Model with AIDS Clinical Studies" Mathematics 12, no. 16: 2583. https://doi.org/10.3390/math12162583

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop