Brief Report

Mathematical Modeling of Clonal Interference by Density-Dependent Selection in Heterogeneous Cancer Cell Lines

1
Moffitt Cancer Center, Integrated Mathematical Oncology, USF Magnolia Drive, Tampa, FL 33612, USA
2
Department of Cell Biology, Microbiology, and Molecular Biology, University of South Florida, 4202 E Fowler Ave, Tampa, FL 33612, USA
3
Department of Computer Science, Najran University, King Abdulaziz Road, Najran 61441, Saudi Arabia
4
Moffitt Cancer Center, Analytic Microscopy Core, USF Magnolia Drive, Tampa, FL 33612, USA
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Cells 2023, 12(14), 1849; https://doi.org/10.3390/cells12141849
Submission received: 2 May 2023 / Revised: 30 June 2023 / Accepted: 8 July 2023 / Published: 14 July 2023

Abstract

Many cancer cell lines are aneuploid and heterogeneous, with multiple karyotypes co-existing within the same cell line. Karyotype heterogeneity has been shown to manifest phenotypically, thus affecting how cells respond to drugs or to minor differences in culture media. Knowing how to interpret karyotype heterogeneity phenotypically would give insights into cellular phenotypes before they unfold temporally. Here, we re-analyzed single cell RNA (scRNA) and scDNA sequencing data from eight stomach cancer cell lines by placing gene expression programs into a phenotypic context. Using live cell imaging, we quantified differences in the growth rate and contact inhibition between the eight cell lines and used these differences to prioritize the transcriptomic biomarkers of the growth rate and carrying capacity. Using these biomarkers, we found significant differences in the predicted growth rate or carrying capacity between multiple karyotypes detected within the same cell line. We used these predictions to simulate how the clonal composition of a cell line would change depending on density conditions during in-vitro experiments. Once validated, these models can aid in the design of experiments that steer evolution with density-dependent selection.

1. Introduction

Cellular heterogeneity is a defining feature of most cancers, and it is critical to tumor progression and treatment failure [1,2]. Advances in sequencing techniques have provided for an unprecedented depth of genetic profiling and ushered in a new era of individualized, data-driven cancer genetics [3]. Despite this progress, the relationship between genetic and phenotypic heterogeneity remains a significant gap in our current understanding.
One contributor to a cancer cell’s phenotype includes large-scale somatic copy number alterations (SCNAs) of 10 mega base pairs or more. Studies show that SCNAs correlate with progression rates and overall survival, with cancer cells grouped by copy number landscape exhibiting the same resistance to chemotherapy [4,5,6,7]. The SCNAs of whole chromosomes or chromosome arms, also known as aneuploidy, are a defining feature of many cancers [8]. Chromosomal instability (CIN) is a hallmark of cancer [9] and accounts for the vast majority of genetic material with an altered copy number state. CIN has been shown to promote metastasis and tumor evolution, particularly in cancers which have common aneuploidy patterns, such as gastric cancer [2,10]. Studies have shown that SCNAs activate oncogenes, disrupt tumor suppressor genes [11], correlate with cancer phenotype [12,13,14,15], and can be spatially segregated within the tumor [16]. Aneuploidy fuels rapid phenotypic evolution and drug resistance, with similar karyotype profiles displaying resistance to the same drug [17]. The correlation between karyotypic and phenotypic divergence is not surprising: two cells with different karyotypes will differ in the expression of thousands of genes, which in turn will have broad phenotypic effects. However, how selection acts upon karyotypes remains poorly understood.
Here, we aim to characterize a targeted subset of the phenotypic differences between co-existing karyotypes, which are defined by their ability to out-compete each other at low vs. high cell densities. Cell densities naturally vary in the tumor over space and time, thus creating niches with distinct selection pressures. A cell that is in a densely packed environment will be under more pressure to overcome contact inhibition than a cell that finds itself in sparse conditions. This variation in evolutionary pressures likely contributes to the coexistence of cancer cells with heterogeneous phenotypes. Density-dependent selection occurs when fitness is a function of population density [18]. A related concept in ecology is life history theory [19] and the r/K selection framework, which investigates trade-offs between the number of offspring a species produces (growth rate or ‘r’) and the ability to compete in dense ecological niches (carrying capacity or ‘K’) [20,21].
We present a framework for examining how density-dependent selection acts on coexisting clones defined by karyotypes and apply it to a set of stomach cancer cell lines. Here, we focus on carrying capacity as defined by spatial limitations rather than metabolic constraints.

2. Results

2.1. Identifying Biomarkers of Growth Rate and Carrying Capacity

By sequencing the DNA and RNA of >36,000 cells from nine stomach cancer cell lines, our prior work classified cells into groups with unique karyotypes [22], which are further referred to as superclones [23] or clones. To compare the growth dynamics across cell lines, we grew eight of these stomach cancer cell lines in a T25 flask until they reached confluence (6–23 days); we then imaged them every day to count cells (Section 4).
We changed the culture media every 3.05 days on average, with media changes becoming more frequent as cells became more confluent. We compared cell counts derived from a Countess (Life Technologies Countess II FL Automated Cell Counter) to cell counts derived from live cell imaging to confirm segmentation accuracy (Appendix A Figure A6), thus supporting the feasibility of monitoring growth dynamics during routine in-vitro experiments.
We fit the Gompertz growth model and two instances of the generalized logistic growth model (Richards and Verhulst) [24] to this time-series data (Figure 1A; see also Section 4.4). The resulting $R^2$ was high (adjusted $R^2 > 0.95$) for each of the models and cell lines (Appendix A Table A1). Using the Akaike Information Criterion, we determined that the Richards model slightly outperformed the Verhulst model, which in turn slightly outperformed the Gompertz model (Appendix A Figure A3). We thus eliminated the Gompertz model from further consideration. Using likelihood profiling [25] to assess practical identifiability, we concluded that the noise levels in the data collected for 4/8 cell lines were too high to infer all the parameters of the Richards model (Appendix A Figure A4). Therefore, we used the simpler Verhulst model (commonly referred to as the “logistic function”) to infer growth dynamics from time-series data for all eight cell lines (see Figure 1, Appendix A Table A2).
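The model comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline (which used the R package growthrates): the data are synthetic, only the rate is fitted by a coarse grid search while the initial size and carrying capacity are held fixed, and all function names are hypothetical. The AIC is written in its least-squares form, which is valid up to an additive constant under Gaussian errors.

```python
import math

def verhulst(t, n0, r, k):
    """Closed-form logistic (Verhulst) growth curve."""
    return k / (1.0 + (k - n0) / n0 * math.exp(-r * t))

def gompertz(t, n0, r, k):
    """Closed-form Gompertz growth curve with the same parameter names."""
    return k * math.exp(math.log(n0 / k) * math.exp(-r * t))

def rss(model, times, counts, n0, r, k):
    """Residual sum of squares between a model curve and the observations."""
    return sum((model(t, n0, r, k) - y) ** 2 for t, y in zip(times, counts))

def fit_rate(model, times, counts, n0, k, grid):
    """Grid search over r only; n0 and k are held fixed to keep the sketch short."""
    best_r = min(grid, key=lambda r: rss(model, times, counts, n0, r, k))
    return best_r, rss(model, times, counts, n0, best_r, k)

def aic_least_squares(rss_value, n_obs, n_params):
    """AIC for Gaussian errors, up to an additive constant."""
    return n_obs * math.log(rss_value / n_obs) + 2 * n_params

# Synthetic daily counts (made up): logistic growth with a small deterministic wobble.
times = list(range(15))
counts = [verhulst(t, 5.0, 0.5, 100.0) * (1 + 0.01 * math.sin(3 * t)) for t in times]

grid = [0.01 * i for i in range(1, 201)]  # candidate rates 0.01 ... 2.00
scores = {}
for name, model in (("verhulst", verhulst), ("gompertz", gompertz)):
    r_hat, rss_hat = fit_rate(model, times, counts, 5.0, 100.0, grid)
    scores[name] = (r_hat, aic_least_squares(rss_hat, len(times), 1))
```

Because the synthetic data were generated from a logistic curve, the Verhulst fit should recover a rate close to 0.5 and obtain the lower AIC; with real counts, all parameters would be fitted jointly and the ranking could differ.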
Overall, there was no significant correlation between the growth rate and the carrying capacity across cell lines (Pearson’s $r = 0.37$, $p = 0.54$). In order to investigate the potential r/K trade-offs between clones within a cell line, we sought to find transcriptomic biomarkers of the inferred growth parameters in the scRNA-seq data available from our prior study [22]. Recently, investigators induced r/K-selection in HeLa cells [20]. The authors found that genes that were differentially expressed between r- and K-selected cells cultured at low densities were enriched among 25 pathways defined in the KEGG database [26]. We tested these pathways for their potential as biomarkers of the growth rate and carrying capacity in five of the eight gastric cancer cell lines (further referred to as the training set; Appendix A Figure A2). For each of the 25 pathways, we fitted a linear regression model to predict the growth parameters inferred for a given cell line from the cell line’s respective pathway activity level. Because the cell cycle is one of the strongest modulators of pathway activity [22,27,28], we also grouped the cells according to their assigned cell cycle state, thereby calculating the median pathway activity across the following: (i) G0G1 cells, (ii) S cells, (iii) G2M cells, and (iv) all cell cycle states combined. The pathways that had the strongest predictive value for the growth rate (r) included ‘Amoebiasis during G2M’ (adjusted $R^2 = 0.89$, $p < 0.02$), ‘Epstein–Barr virus infection’, and ‘AMPK signaling pathway during S-phase’. For the carrying capacity (K), the ‘PI3K-Akt signaling pathway’ (adjusted $R^2 = 0.91$, $p < 0.01$) and ‘Arginine and proline metabolism’ had the strongest signals (Appendix A Figure A2). This process was repeated to include the remaining three cell lines (further referred to as the validation set), but only for these top performing pathways and cell cycle states (Figure 1B).
Of these top pathways, 40% were confirmed in the validation set (FDR-adjusted $p < 0.05$; Figure 1B), including ‘Amoebiasis during G2M’ and ‘Arginine and proline metabolism’, which were further used as proxies of the growth rate (r) and carrying capacity (K), respectively.

2.2. Towards Informing Future Experiments: Steering Clonal Evolution

The identification and characterization of co-existing clones within a cell line requires high-throughput assays, such as single-cell sequencing. Repeated measurements of a cell line’s clonal compositions at a high temporal resolution are thus cost-prohibitive. This underscores the need to identify biomarkers that can be used to estimate clonal growth parameters.
We previously identified 39 clones, which were each defined by distinct SCNAs and confirmed by both scDNA- and scRNA-seq data across the eight cell lines [22] (Figure 1E). Using pathway activity levels from the KEGG pathways ‘Amoebiasis’ and ‘Arginine and proline metabolism’, we predicted the growth rate and carrying capacity for every sequenced cell in each cell line. By grouping cells according to their clone membership (Figure 1F,G), we concluded that up to two r/K trade-offs may exist between clone pairs in two of the eight analyzed cell lines (Online Methods and Appendix A Table A3), namely in NCI-N87 and SNU-668 (Figure 2A).
While the growth conditions for SNU-668 can be modulated to favor the r-selected clone, by the time it exceeds the K-selected clone, its frequency in the cell line is less than 0.001%. In this case, the r/K trade-off between the two clones has little practical relevance, because other clones in the cell line take over the population faster than the two clones can outcompete each other (Figure 2B). In contrast, the r/K trade-off identified in NCI-N87 did have practical relevance, with most cell culture schedules resulting in populations where at least one of the two clones maintains high cell representation (>5%; Figure 2C).
Optimizing the seeding density and the timing for splitting cells should thus enable evolutionary steering of the cell line’s clonal composition across passages. A condition for such an optimal time to exist is that growth of the population can never be negative, which we prove analytically (Appendix A.2). We used the growth parameters predicted for the clones identified in the NCI-N87 cell line (Appendix A Table A3) to parameterize a multi-compartment ODE model to simulate how clonal composition changes over time (Figure 2B):
$$\frac{dN_i}{dt} = N_i\, r_i \left(1 - \frac{\sum_{j=1}^{n} N_j}{K_i}\right), \tag{1}$$
where $N_i$ is the cell representation of clone $i$, and $K_i$ and $r_i$ are the biomarker-inferred carrying capacity and maximum growth rate of each clone, respectively (see also Section 4.4).
The simulations shown in Figure 2D predict that shifts in clonal composition will likely occur as NCI-N87 cells grow in-vitro, with the magnitude of these shifts depending on the seeding density and the timing of splitting cells (Figure 2C–E). For example, with a low seeding density and a splitting interval of 7 days, the r-selected clone outcompeted the K-selected clone after only four passages (Figure 2E). By contrast, a high seeding density and a splitting interval of only 1 day maintained a relative dominance of the K-selected clone over >30 passages (Figure 2E). This suggests that, for a subset of cell lines, density conditions during in-vitro experiments can be optimized to either accelerate or delay changes in clonal composition.
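The passaging experiment described above can be illustrated with a small simulation of the multi-clone logistic model. This is a sketch under stated assumptions rather than the authors' implementation (which used MATLAB): the two clones, their growth rates and carrying capacities, and the seeding totals are made-up values, the integrator is a hand-written fourth-order Runge-Kutta step, and `simulate_passages` is a hypothetical helper.

```python
def growth_rates(pop, r, k):
    """Right-hand side of the multi-clone logistic model (shared total density)."""
    total = sum(pop)
    return [n * ri * (1.0 - total / ki) for n, ri, ki in zip(pop, r, k)]

def rk4_step(pop, r, k, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = growth_rates(pop, r, k)
    k2 = growth_rates([n + 0.5 * dt * d for n, d in zip(pop, k1)], r, k)
    k3 = growth_rates([n + 0.5 * dt * d for n, d in zip(pop, k2)], r, k)
    k4 = growth_rates([n + dt * d for n, d in zip(pop, k3)], r, k)
    return [n + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for n, a, b, c, d in zip(pop, k1, k2, k3, k4)]

def simulate_passages(freqs, r, k, seed_total, days_per_passage, n_passages, dt=0.01):
    """Grow for one passage, then re-seed at seed_total keeping clone frequencies."""
    history = [list(freqs)]
    steps = int(round(days_per_passage / dt))
    for _ in range(n_passages):
        pop = [f * seed_total for f in freqs]
        for _ in range(steps):
            pop = rk4_step(pop, r, k, dt)
        total = sum(pop)
        freqs = [n / total for n in pop]
        history.append(freqs)
    return history

# Hypothetical two-clone example: clone 0 is r-selected, clone 1 is K-selected.
r = [0.9, 0.5]        # maximum growth rates per day (made up)
k = [8.0e5, 1.2e6]    # carrying capacities in cells (made up)
sparse = simulate_passages([0.5, 0.5], r, k, seed_total=1.0e4,
                           days_per_passage=7, n_passages=4)
dense = simulate_passages([0.5, 0.5], r, k, seed_total=9.0e5,
                          days_per_passage=1, n_passages=4)
```

With these illustrative values, sparse seeding with weekly splitting enriches the fast-growing clone, whereas dense seeding with daily splitting enriches the clone with the higher carrying capacity, which mirrors the qualitative pattern reported for NCI-N87.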

3. Discussion

Our work rests on the shoulders of Uri Ben-David et al.’s landmark paper [29], which examined the responsiveness of 21 different variants of the same cell line to 321 anti-cancer agents. They found that 75% of the tested compounds that strongly inhibited some variants were inactive in others and that copy number changes (which are dominated by differences in karyotype) explained most of this differential phenotypic response. Two follow-up questions their work raises are: “what is the mechanism of clonal interference for a specific culture condition?” and “can we predict the clonal composition that will emerge from a given culture condition?”. Our work begins to answer these multi-faceted questions. Through a re-analysis of existing scRNA-seq data from eight stomach cancer cell lines [22], we confirmed the biomarkers of growth rate and contact inhibition previously identified in HeLa cells [20], thus suggesting that similarities in the mechanisms of contact inhibition exist across multiple tissue and cancer types. Differential expression of biomarkers for growth rate and contact inhibition/carrying capacity between clones within a cell line (Figure 1) suggests that different population densities will alter a cell line’s clonal composition. However, the presence of significant r/K trade-offs within cell lines was limited to only two clone pairs (Figure 2A). Most experimental protocols will not see the cells spend a long time at high density and will thus tend to select for cells with higher growth rates in the exponential phase, which potentially explains the relative lack of significant r/K trade-offs within cell lines. Despite that, the predicted intra-cell line variability in both carrying capacity and maximum growth rate (Figure 1F,G) heralds ongoing changes in cellular composition, which are in line with changes in clonal evolution we observed previously over only 5–7 passages in a sub-set of these cell lines [22].
Our study focuses on a naturally fluctuating feature of the cellular microenvironment: population density. We therefore define phenotypic differences between karyotypes as their ability to out-compete each other at low vs. high population densities. However, we do not expect these traits to be the only phenotypic difference between the identified karyotypes. The expression magnitude of thousands of genes distinguishes any two karyotypes, and, with them, we expect multiple phenotypes to also differ. However, in the absence of any other differences in growth conditions (such as drug exposure), most phenotypic differences between karyotypes will be latent [29,30], thus allowing us to observe their relatively subtle differential growth under variable density conditions. We consider these differences to be only the tip of the iceberg among all phenotypic differences between co-existing karyotypes. Nevertheless, we expect that the realization that even the most basic cell culture habits (seeding density and passaging frequency) can influence long-term cellular dynamics will prompt the scientific community to prioritize developing databases that routinely record detailed cell culture protocols.
Clones that are far from each other in r–K space may also be spatially segregated and have differential treatment sensitivities. For example, the slower growth rate of K-selected cells may render them more resistant to cytotoxic therapies [31]—an effect that should be amplified by their spatial segregation [32,33]. We would expect K-selected sub-populations to be closer to the center of the tumor mass, where they are also more protected from drug penetration. By contrast, the enhancing tumor should be dominated by r-selected cells [31]. This scenario opens the door to an emerging concept of cancer treatment called adaptive therapy [34]. Adaptive regimens aim to optimize doses and dose schedules to maintain a fixed population of therapy-sensitive cells such that these, in turn, can suppress the growth of resistant cells [34]. Biomarkers that quantify the representation of r- and K-selected sub-populations could thus contribute to the design of adaptive regimens that spare a narrow border of r-selected cells such that the K-selected population stays enclosed [31]. The growing field of spatial transcriptomics makes it possible to measure the expression levels of thousands of genes throughout tissue space [35]. The ability to predict a cell’s placement along the r–K phenotype continuum from its transcriptome could thus facilitate the spatial de-lineation of r- from K-selected sub-populations in primary tumor tissue sections. Testing these ideas would require expanding our models into a spatial in-vivo framework. However, an understanding of the spatial distribution of r- and K-selected sub-populations could facilitate the design of therapy regimens, such as adaptive therapy or double bind therapy [36], aiming to stabilize the overall tumor burden to prevent or at least delay therapy resistance.
Our in silico results add to the evidence accumulating from many other studies [22,29,37] that every cell division is an opportunity for the cells to mutate and adapt to their environment. This insight underscores the need for routine tracking of the pedigree of evolving cell populations over decades, along with potentially changing environments (e.g., cell culture habits and therapy). Such efforts could help reveal long-term trends in the evolution of cell lines that remain elusive at shorter time-scales.

4. Online Methods

4.1. Cell Culture

The identity of each cell line was determined through independent karyotyping and mycoplasma contamination assessment. Cells were cultured in their recommended media conditions at 37 °C. For HGC-27, EMEM (Quality Biological Inc., Gaithersburg, MD, USA) was used; for KATOIII, a 1:1 mix of RPMI-1640 and EMEM was used; for NCI-N87, RPMI-1640 (ATCC modified) was used; the remaining five cell lines (MKN-45, NUGC-4, SNU-601, SNU-638, and SNU-668) used RPMI-1640. All cell lines were grown in the aforementioned media with 10% fetal bovine serum (Gibco, Carlsbad, CA, USA) and 1% penicillin-streptomycin (Gibco). For cell passaging, Trypsin EDTA (0.25%) with phenol red (Gibco) was added, followed by inactivation using the respective growth media. Cells were seeded at a low initial density (range: $4.98$–$38.64 \times 10^{3}$ cells/cm$^2$) in order to allow for the population’s exponential growth and subsequent plateau at the carrying capacity (range: $1.92$–$15.32 \times 10^{5}$ cells/cm$^2$). The end-point for each experiment was determined by fitting a generalized logistic growth curve to the in-vitro data at various points during the time in culture. Across the eight cell lines, the time in culture ranged from 6–21 days, with media being refreshed on average every 48 h until the cells reached confluence, whereupon media was refreshed every 24 h. The experiment was stopped once the fold change in the inferred carrying capacity was less than 4% when removing any subset of the last 20% of time points.

4.2. Microscopy

Cells were seeded in a T25 flask (Fisherbrand), and sets of 4 phase contrast images were taken at 20× magnification for NCI-N87 and at 10× magnification for the other cell lines on an Evos FL. Segmentation with Cellpose was used to quantify the cell count and cellular features (number of detections, centroid X (μm), centroid Y (μm), ROI, area (μm²), and perimeter (μm)). Growth curves were generated to examine the carrying capacity of each cell line.
Image preprocessing: The microscopy image acquisition and light settings resulted in variations in the brightness of acquired images. The automatic segmentation and counting pipeline applied to dark images showed inaccurate results (i.e., mostly false negatives). Therefore, we added pre-processing steps to our pipeline, which consisted of the following: (i) gamma correction to reduce darkness; (ii) histogram equalization to improve image contrast; and (iii) Gaussian blurring for smoothing.
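The three pre-processing steps can be sketched on a tiny 8-bit grayscale image stored as nested lists. This is an illustration only: the source does not state which library or parameter values were used, so the gamma value, the 3×3 kernel, the edge handling, and the function names below are all assumptions.

```python
def gamma_correct(img, gamma):
    """Brighten (gamma < 1) or darken (gamma > 1) an 8-bit grayscale image."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]

def equalize(img):
    """Histogram equalization via the cumulative distribution of gray levels."""
    flat = [p for row in img for p in row]
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    span = max(len(flat) - cdf_min, 1)  # avoid division by zero for flat images
    return [[round(255 * (cdf[p] - cdf_min) / span) for p in row] for row in img]

def gaussian_blur3(img):
    """3x3 Gaussian kernel (1-2-1 weights per axis) with edge replication."""
    h, w = len(img), len(img[0])
    weights = (1, 2, 1)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += weights[dy + 1] * weights[dx + 1] * img[yy][xx]
            row.append(round(acc / 16))
        out.append(row)
    return out

# Made-up dark 4x4 image with a slightly brighter "cell" in the middle.
dark = [[10, 12, 15, 20], [12, 30, 35, 18], [11, 32, 40, 16], [10, 13, 14, 12]]
processed = gaussian_blur3(equalize(gamma_correct(dark, 0.5)))
```

The steps are applied in the order listed in the text (gamma correction, equalization, blurring); in practice an image library would replace these hand-written loops.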
Cell segmentation: Fully automated cell segmentation of phase contrast images was performed using Cellpose software.
Cellpose is a deep learning approach based on the U-Net architecture for cell segmentation [38,39], where vertical and horizontal spatial gradients of cells are predicted. Furthermore, Cellpose predicts a binary map of a cell location either inside or outside a region of interest (ROI). Using both the combined vertical and horizontal gradients and the binary map, the cell localization and generation of binary masks for every cell are performed. Our pipeline for cell segmentation using Cellpose used 2D microscopic images as input. An ROI annotation rectangle was applied to the phase contrast image (left corner index of the rectangle: (100, 100); rectangle width and height: (1800 px, 1100 px)). This ROI was used for subsequent image segmentation and analysis. We fine-tuned a pre-trained Cellpose model called Cytotorch_2 to segment cells in microscopic images of the different cell lines. After fine-tuning, the trained model was tested on a hold-out set for evaluation. Then, feature extraction of each segmented cell was performed for further analysis of cell growth at a given time point. The features extracted from each cell segmentation included area, perimeter, roundness, and centroid. The pipeline saved the extracted features and a visualization of the segmentation to a user-specified folder for cell growth estimation and analysis.

4.3. Determination of Clonal Growth Parameters

Determining relative clonal carrying capacities, growth rates, and loss of contact inhibition parameter values is a necessary first step in modeling density-dependent selection in heterogeneous cell lines. To achieve this, we used gene expression signatures as surrogates of growth parameters as previously described [40,41].
Quantification of single cell pathway activity from gene expression:
In order to identify cells with active gene sets, we utilized AUCell version 3.18 for R version 4.3.1 [42]. AUCell takes as its input the scRNA sequencing data for the cells of interest and a list of gene sets. The output is the gene set activity for each cell. AUCell uses the area under the curve across the rankings of all genes of a particular cell, where genes are ranked by their expression value. A rank-based scoring method means AUCell is not affected by units or normalization methods of the gene expression data. Genes in the top 5% of the ranking are considered active in a given gene set. In order to account for potential batch effects, scRNA sequencing data for all cell lines were analyzed together. Cells with less than 200 features were excluded. Seurat version 2.3.4 was used to create a Seurat object for input to AUCell to quantify the activity of more than 2000 pathways from the KEGG database. For all further analysis, we focused on only a subset of 25 KEGG pathways, which were previously identified as being differentially expressed between r-selected and K-selected HeLa cells [20].
Clonal growth parameters:
We tested these 25 KEGG pathways quantified by AUCell as biomarkers of growth rate (r) and carrying capacity (K). The test takes the form of a linear model:
$$\delta \sim a\,p + c, \tag{2}$$
where $\delta$ is the parameter value ($r$ or $K$), and $p$ is the pathway activity. The pathways are then ranked by adjusted $R^2$, and the top five best-fitting pathways are prioritized as potential biomarkers of a given parameter.
With the linear models correlating pathway activity with growth parameters built at the cell line level, we can predict growth parameters for all clonal populations using their pathway activity as input (Figure 1E,F, Appendix A Table A3). Once parameters have been predicted, clonal growth can be simulated as systems of paired ODEs (see Section 4.4).
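The regression-and-ranking step, followed by prediction for a clone, might look like the sketch below. It assumes one growth-parameter value and one median pathway activity per training cell line; the pathway names, numbers, and helper functions are hypothetical and serve only to show the mechanics of ranking by adjusted $R^2$.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ a * x + c; returns (a, c, adjusted R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    c = mean_y - a * mean_x
    ss_res = sum((yi - (a * xi + c)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)  # one predictor
    return a, c, adj_r2

def rank_pathways(activity_by_pathway, parameter):
    """Fit one line per pathway and order pathway names by adjusted R^2 (best first)."""
    fits = {name: fit_line(values, parameter)
            for name, values in activity_by_pathway.items()}
    ranking = sorted(fits, key=lambda name: fits[name][2], reverse=True)
    return ranking, fits

# Made-up values for five training cell lines.
growth_rate = [0.30, 0.45, 0.50, 0.62, 0.80]
activity = {
    "pathway_A": [0.10, 0.16, 0.18, 0.22, 0.30],
    "pathway_B": [0.20, 0.05, 0.25, 0.10, 0.15],
}
ranking, fits = rank_pathways(activity, growth_rate)

# Apply the best model to a clone's (hypothetical) median pathway activity.
best = ranking[0]
slope, intercept, _ = fits[best]
clone_activity = 0.20
predicted_clone_rate = slope * clone_activity + intercept
```

Fitting at the cell line level and predicting at the clone level, as here, relies on the assumption that the activity-to-parameter relationship transfers between the two scales.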

4.4. Mathematical Models of In-Vitro Cell Growth

The nature of tumor growth is not well known, and the exact laws which govern the growth of tumor cells will likely be context-dependent (cancer type, location in the body, etc.). However, even when many of these dependencies are held constant, it is difficult to discern between various models of cancer cell growth [43,44,45,46]. We fit generalized logistic growth models [24] to the time series cell growth data for eight gastric cancer cell lines (Appendix A Table A1):
$$\frac{dN_i}{dt} = N_i\, r_i \left[1 - \left(\frac{\sum_{j=1}^{n} N_j}{K_i}\right)^{v}\right], \tag{3}$$
where $i$ is either the entire cell line population (i.e., $n = 1$) or one of multiple clones within a cell line (i.e., $n \geq 3$). $K_i$ and $r_i$ are the carrying capacity and maximum growth rate of each population, respectively. $v$ is the loss of contact inhibition, which is assumed to be identical for all clones within a cell line. Two instances of Equation (3) were fitted to the data: one representing the Richards growth model, where $v$ was kept variable, and the other representing the logistic growth model with $v := 1$.
In addition to the two models mentioned above, we also fitted cell line population growth with the Gompertz model [24]. These models were chosen due to their prevalence in the literature for describing the growth of tumors and their biological interpretation. The growth data were read into R and fit using the package growthrates [47]. In order to compare the quality of growth-model fits, we calculated the adjusted $R^2$ for each of the three models across all eight cell lines (Appendix A Table A1). We then compared Akaike Information Criterion scores (Appendix A Figure A3) and found the Richards model to have the lowest score in 7/8 cell lines. However, identifiability analysis revealed that we were unable to confidently infer the growth rate ($r$) and/or loss of contact inhibition ($v$) parameters in 4/8 cell lines (Appendix A Figure A4 and Figure A5). Thus, we modeled in-vitro growth using the logistic growth model. To model clonal growth dynamics, we parameterized Equation (3) using the values in Appendix A Table A3 and solved it with the ode45 solver in MATLAB version 9.7.0.1216025.

4.5. Identifying r/K Trade-Offs between Co-Existing Clones within a Cell Line

Let $C = \{N_1, \ldots, N_n\}$ be the set of $n$ clones identified within a given cell line. For each pair $\{x, y \mid x, y \in C \wedge x \neq y\}$, we use a Student’s t-test to compare the growth rates predicted for cells assigned to clone $x$ ($r_x$) vs. cells assigned to clone $y$ ($r_y$). We do the same for the carrying capacity predicted for cells of the two clones ($K_x$, $K_y$). For clone pairs with a p-value $\leq 0.1$ for both $r$ and $K$, we further calculate $\tau_r(x, y) = \bar{r}_x / \bar{r}_y$ and $\tau_k(x, y) = \bar{K}_x / \bar{K}_y$. We define $(x^*, y^*)$ as clones with potential r/K trade-offs:
$$(x^*, y^*) = \{x, y \mid (\tau_r(x, y) < 1 \wedge \tau_k(x, y) > 1) \vee (\tau_r(x, y) > 1 \wedge \tau_k(x, y) < 1)\}. \tag{4}$$
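The pair-selection rule can be expressed compactly in code. The sketch below assumes that the per-clone mean predictions and the t-test p-values have already been computed elsewhere; the clone labels, values, and the `find_tradeoffs` helper are hypothetical, and the 0.1 threshold is taken from the text.

```python
from itertools import combinations

def find_tradeoffs(mean_r, mean_k, p_r, p_k, alpha=0.1):
    """Return clone pairs whose rate and capacity ratios point in opposite directions."""
    hits = []
    for x, y in combinations(sorted(mean_r), 2):
        pair = (x, y)
        if p_r[pair] > alpha or p_k[pair] > alpha:
            continue  # both differences must pass the t-test threshold
        tau_r = mean_r[x] / mean_r[y]
        tau_k = mean_k[x] / mean_k[y]
        if (tau_r < 1 and tau_k > 1) or (tau_r > 1 and tau_k < 1):
            hits.append(pair)
    return hits

# Made-up clone-level means and precomputed t-test p-values.
mean_r = {"c1": 0.60, "c2": 0.45, "c3": 0.50}
mean_k = {"c1": 9.0e5, "c2": 1.2e6, "c3": 8.0e5}
p_r = {("c1", "c2"): 0.01, ("c1", "c3"): 0.04, ("c2", "c3"): 0.30}
p_k = {("c1", "c2"): 0.02, ("c1", "c3"): 0.03, ("c2", "c3"): 0.01}
tradeoffs = find_tradeoffs(mean_r, mean_k, p_r, p_k)
```

In this toy input, only the pair (c1, c2) qualifies: c1 grows faster but has the lower carrying capacity, and both differences pass the threshold. The pair (c1, c3) is rejected because c1 is better on both axes, and (c2, c3) because the growth-rate difference is not significant.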

Author Contributions

Conceptualization, T.V. and N.A.; methodology, T.V., S.A., A.S., J.J. and N.A.; software, S.A. and N.A.; validation, T.V., S.A., R.B. and N.A.; formal analysis, T.V., S.A. and N.A.; investigation, T.V. and N.A.; resources, N.A.; data curation, T.V., S.A., A.S. and N.A.; writing—original draft preparation, T.V., S.A. and N.A.; writing—review and editing, T.V. and N.A.; visualization, T.V., S.A. and N.A.; supervision, N.A.; project administration, N.A.; funding acquisition, N.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Institute of Health at the National Cancer Institute (1R37CA266727-01A1 to N.A.) and by the Moffitt Cancer Center Evolutionary Therapy Center of Excellence (CET 30-20458-03-20 to N.A.). The funders had no role in the study design, data collection and analysis, decision to publish, or the preparation of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Supporting data for the reported results, as well as the code that generates the main figures in this manuscript, can be found at https://www.github.com/MathOnco/densityDependentSelection.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Identifiability Analysis

Identifiability analysis is a group of methods used to determine how well the parameters of a model are estimated [25]. The likelihood function is one method that can be used to identify parameter values that are more likely than others. In other words, it tells us which parameter values could be expected to produce the data to which we are fitting the model. Profiling likelihoods is achieved by constructing a function that minimizes the negative log likelihood for a fixed value of a parameter of interest, and it returns z, which is the signed square-root deviance from the minimum. At the maximum likelihood estimation, z = 0 by definition.
Let the vector $\theta$ be the group of parameters used in Equation (3) to make predictions. Then, the likelihood function can be written as $L(\theta \mid x) = f_{\theta}(x)$. Suppose that $\theta$ can be decomposed as $\theta = (\delta, \xi)$, where $\delta$ is the parameter of interest, and $\xi$ is a vector containing the nuisance parameters. We can re-write our likelihood function as $L(\delta, \xi \mid x) = f_{\delta, \xi}(x)$. By profiling, we are concentrating the likelihood function for a subset of parameters by expressing the nuisance parameters as a function of the parameter of interest and replacing the nuisance parameter in the likelihood function. Thus, the profile likelihood is calculated as follows:
$$L_p(\delta) = \sup_{\xi} L(\delta, \xi \mid x). \tag{A1}$$
Let $\epsilon$ be the error between observations and model fit, as defined above, and let the growth rate ($r$) be our parameter of interest so that $\delta := r$ and $\xi := (K, v)$. Then, in each direction away from $\operatorname{argmax}_{\delta}(L_p(\delta))$, we fix $\delta$ by $\pm\,\mathrm{s.d.}(\epsilon)$, and we adjust $\xi$ such that $z$ is minimized. This process continues until $z$ stops changing for new values of $\delta$ or is equal to the MLE fit error. Then, $z$ approximates a chi-squared distribution on which we can compute the confidence intervals. We define parameters with confidence intervals that have a >2-fold difference between the 2.5% and 97.5% quantiles as unidentifiable. This analysis revealed that the growth rate ($r$) or loss of contact inhibition ($v$) parameters were unidentifiable in the Richards model in 4/8 cases (Appendix A Figure A4, Appendix A Figure A5). Thus, we excluded the Richards model and selected the logistic model, given its better fits and lower AIC scores compared to the Gompertz model (Appendix A Table A1 and Figure A3).
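The idea of profiling can be illustrated with a simplified grid-based variant. This sketch is not the stepping procedure described above: it profiles the growth rate of the logistic model over a fixed grid, minimizes over a single nuisance parameter ($K$) by brute force, assumes Gaussian errors with the variance profiled out, and uses the standard 95% likelihood-ratio cutoff of 3.841 (chi-squared, one degree of freedom). The data and all names are made up.

```python
import math

def verhulst(t, n0, r, k):
    """Closed-form logistic (Verhulst) growth curve."""
    return k / (1.0 + (k - n0) / n0 * math.exp(-r * t))

def rss(times, counts, n0, r, k):
    """Residual sum of squares for one parameter combination."""
    return sum((verhulst(t, n0, r, k) - y) ** 2 for t, y in zip(times, counts))

def profile_rate(times, counts, n0, r_grid, k_grid):
    """For each fixed r, minimize the Gaussian negative log-likelihood over K."""
    n = len(times)
    profile = []
    for r in r_grid:
        best = min(rss(times, counts, n0, r, k) for k in k_grid)
        profile.append(0.5 * n * math.log(best / n))
    return profile

def confidence_interval(r_grid, profile, threshold=3.841):
    """Likelihood-ratio interval: keep r values within the chi-squared cutoff."""
    nll_min = min(profile)
    inside = [r for r, nll in zip(r_grid, profile)
              if 2.0 * (nll - nll_min) <= threshold]
    return min(inside), max(inside)

# Synthetic counts (made up): logistic growth with a small deterministic wobble.
times = list(range(15))
counts = [verhulst(t, 5.0, 0.5, 100.0) * (1 + 0.01 * math.sin(3 * t)) for t in times]

r_grid = [0.01 * i for i in range(10, 151)]    # 0.10 ... 1.50
k_grid = [50.0 + 1.0 * i for i in range(101)]  # 50 ... 150
profile = profile_rate(times, counts, 5.0, r_grid, k_grid)
low, high = confidence_interval(r_grid, profile)

# Identifiability rule from the text: at most a 2-fold spread of the interval.
identifiable = high / low <= 2.0
```

On low-noise data such as this, the interval around the true rate is narrow and the parameter is classified as identifiable; noisier time series widen the interval and can push it past the 2-fold criterion.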

Appendix A.2. Analytical Results Optimizing Cell Passaging for Logistic Growth Model (Two Clone Case)

We identified two r/K trade-offs between clone pairs across eight cell lines (Figure 2A, Appendix A Table A3). This suggests that, for these cell lines, an optimal time for splitting the cells exists where the trade-off is balanced. Optimizing the timing for splitting the cells should thus stabilize a cell line’s clonal composition over multiple passages. A condition for such an optimal time to exist is that the growth of the population can never be negative. Here, we prove that a heterogeneous cell line with a positive growth rate, consisting of two clones with an r/K trade-off, will never have a negative growth rate.
Let $i \in \{1, \dots, n\}$ be a clone within a cell line consisting of $n$ clones. Let each clone have its own specific growth rate, $r_i$, and carrying capacity, $K_i$. Then, the change in the number of cell members, $N_i$, of clone $i$ can be modeled as follows:
$$\frac{dN_i}{dt} = (N_i \cdot r_i) \cdot \left(1 - \frac{\sum_{j=1}^{n} N_j}{K_i}\right), \tag{A2}$$
which has the following solution:
$$N_i(t) = \frac{N_i \cdot K_i}{\sum_{j=1}^{n} N_j + \left(K_i - \sum_{j=1}^{n} N_j\right) \cdot e^{(-r_i \cdot t)}}. \tag{A3}$$
To arrive at the solution in (A3), we start by separating the variables in (A2), thereby obtaining $dt$ on the LHS and $dN_i$ on the RHS:
$$(r_i)(dt) = \frac{dN_i}{N_i \cdot \left(1 - \frac{\sum_{j=1}^{n} N_j}{K_i}\right)}. \tag{A4}$$
Next, we expand the RHS of (A4) through partial fraction decomposition, where $\Gamma$ and $\Psi$ are unknown constants:
$$\frac{1}{N_i \cdot \left(1 - \frac{\sum_{j=1}^{n} N_j}{K_i}\right)} = \frac{\Gamma}{N_i} + \frac{\Psi}{\left(1 - \frac{\sum_{j=1}^{n} N_j}{K_i}\right)}. \tag{A5}$$
We multiply both sides of (A5) by the LHS denominator to simplify, giving us
$$1 = \Gamma \cdot \left[1 - \frac{\sum_{j=1}^{n} N_j}{K_i}\right] + (\Psi \cdot N_i). \tag{A6}$$
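Then, by solving for $\Psi$ and $\Gamma$ we obtain the following:
$$\Psi = \frac{1}{N_i} - \frac{\sum_{j=1}^{n} N_j}{N_i} + \Gamma \cdot \frac{\sum_{j=1}^{n} N_j}{N_i \cdot K_i}, \qquad \Gamma = 1. \tag{A7}$$
Now, we can return to integrating the following:
$$\int r_i \, dt = \int \frac{dN_i}{N_i} + \int \frac{\dfrac{\frac{1}{N_i} - \frac{\sum_{j=1}^{n} N_j}{N_i} + \Gamma \cdot \frac{\sum_{j=1}^{n} N_j}{N_i \cdot K_i}}{K_i}}{1 - \frac{\sum_{j=1}^{n} N_j}{K_i}}. \tag{A8}$$
The LHS and the first term of the RHS are straightforward to integrate. For the second term of the RHS, we integrate by $u$-substitution, where $u = 1 - \frac{\sum_{j=1}^{n} N_j}{K_i}$ and $du = -\frac{1}{K_i}$. We then obtain the following:
$$-\int \frac{du}{u} = -\ln[u] = -\ln\left[1 - \frac{\sum_{j=1}^{n} N_j}{K_i}\right]. \tag{A9}$$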
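Then, by putting all the terms together and accounting for the constant of integration $C$, we obtain the following:
$$(r_i \cdot t) + C = \ln\left[\frac{N_i}{1 - \frac{\sum_{j=1}^{n} N_j}{K_i}}\right]. \tag{A10}$$
Next, we exponentiate both sides:
$$\mathcal{C} \cdot e^{(r_i \cdot t)} = \frac{N_i}{1 - \frac{\sum_{j=1}^{n} N_j}{K_i}}. \tag{A11}$$
We find how $\mathcal{C} = e^{C}$ relates to the initial conditions:
$$\mathcal{C} = \frac{N_i \cdot K_i}{K_i - \sum_{j=1}^{n} N_j}. \tag{A12}$$
We plug $\mathcal{C}$ into (A11) and solve for $N$:
$$N_i(t) = \frac{N_i \cdot K_i}{\sum_{j=1}^{n} N_j + \left(K_i - \sum_{j=1}^{n} N_j\right) \cdot e^{(-r_i \cdot t)}}. \tag{A13}$$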
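As a quick sanity check, the closed form can be compared with a numerical integration in the single-clone special case ($n = 1$), where $\sum_j N_j$ is simply the clone's own size and (A13) reduces to the familiar logistic curve. The sketch below is illustrative only: $r$ and $K$ are taken from the HGC-27 row of Table A2, whereas the seeding size `n0`, the time horizon, and the Runge-Kutta integrator are assumptions made for this example.

```python
import math

def closed_form(t, n0, r, K):
    """Single-clone case of (A13): sum_j N_j reduces to the clone's own size."""
    return n0 * K / (n0 + (K - n0) * math.exp(-r * t))

def integrate_rk4(n0, r, K, t_end, steps=4000):
    """Integrate dN/dt = r*N*(1 - N/K) with the classical Runge-Kutta scheme."""
    def f(n):
        return r * n * (1.0 - n / K)
    h, n = t_end / steps, n0
    for _ in range(steps):
        k1 = f(n)
        k2 = f(n + 0.5 * h * k1)
        k3 = f(n + 0.5 * h * k2)
        k4 = f(n + h * k3)
        n += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return n

# r and K from the HGC-27 row of Table A2; n0 and t_end are hypothetical.
r, K, n0 = 0.96, 5_095_556.0, 1.0e5
t_end = 10.0
analytic = closed_form(t_end, n0, r, K)
numeric = integrate_rk4(n0, r, K, t_end)
rel_err = abs(analytic - numeric) / analytic
print(f"closed form: {analytic:.1f}, RK4: {numeric:.1f}, relative error: {rel_err:.2e}")
```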
Now, we can move from population sizes to frequencies by letting $p = \frac{N_i}{\sum_{j=1}^{n} (N_j)}$ and $T = \sum_{i=1}^{n} (N_i)$, where $p$ is the proportion of one of our subclones, $N_i$, and $T$ is the total population. Thus, the equations that describe how $p$ and $T$ change over time in the two-clone case are as follows:
$$\frac{dp}{dt} = p \cdot (1 - p)\left[r_a \cdot \left(1 - \frac{T}{K_a}\right) - r_b \cdot \left(1 - \frac{T}{K_b}\right)\right] \tag{A14}$$
$$\frac{dT}{dt} = T\left[(p\,r_a)\left(1 - \frac{T}{K_a}\right) + (1 - p)(r_b)\left(1 - \frac{T}{K_b}\right)\right], \tag{A15}$$
where $K_a$ and $K_b$ are the carrying capacities of clones (a) and (b), respectively, and $r_a$ and $r_b$ are the respective growth rates.
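The change of variables can be checked numerically: integrating the two-clone version of (A2) directly and integrating (A14) and (A15) should give the same $p$ and $T$. The following is a minimal sketch under assumed, illustrative parameter values (an r/K trade-off with the population scaled so that $K_b = 1$); the function names and the Runge-Kutta integrator are choices made for this example.

```python
def clone_rates(na, nb, ra, Ka, rb, Kb):
    """Right-hand side of (A2) for two clones a and b."""
    total = na + nb
    return (ra * na * (1.0 - total / Ka), rb * nb * (1.0 - total / Kb))

def pt_rates(p, T, ra, Ka, rb, Kb):
    """Right-hand side of (A14) and (A15)."""
    ga = ra * (1.0 - T / Ka)
    gb = rb * (1.0 - T / Kb)
    return (p * (1.0 - p) * (ga - gb), T * (p * ga + (1.0 - p) * gb))

def rk4(rates, state, params, t_end, steps):
    """Classical Runge-Kutta integration of a two-variable system."""
    h = t_end / steps
    x, y = state
    for _ in range(steps):
        k1 = rates(x, y, *params)
        k2 = rates(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], *params)
        k3 = rates(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], *params)
        k4 = rates(x + h * k3[0], y + h * k3[1], *params)
        x += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, y

# Hypothetical trade-off: clone a grows faster, clone b has the higher capacity.
params = (0.9, 0.8, 0.5, 1.0)  # ra, Ka, rb, Kb
na0, nb0 = 0.05, 0.05

na, nb = rk4(clone_rates, (na0, nb0), params, t_end=30.0, steps=6000)
p, T = rk4(pt_rates, (na0 / (na0 + nb0), na0 + nb0), params, t_end=30.0, steps=6000)

p_from_clones = na / (na + nb)
T_from_clones = na + nb
print(f"p: {p:.6f} vs {p_from_clones:.6f}; T: {T:.6f} vs {T_from_clones:.6f}")
```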
Next, we sought to prove that Equation (A15) is never negative. To do so, we started out by assuming the opposite, namely that there exists some time point where $\frac{dT}{dt} < 0$, and arrived at the proof by contradiction. Please note that we are most interested in the situation where there is an r/K trade-off. That is, for the rest of this exercise, please assume $r_a > r_b$ and $K_a < K_b$. Let us also assume that our two populations are well-mixed and initially comprise no more than 60 percent of $\max(K_a, K_b)$.
If we let
$$\frac{dT}{dt} = T\left[(p\,r_a)\left(1 - \frac{T}{K_a}\right) + (1 - p)(r_b)\left(1 - \frac{T}{K_b}\right)\right] = 0, \tag{A16}$$
and solve for $T$, we obtain the following:
$$T = \frac{(K_a K_b)\left[(r_b\,p) - r_b - (r_a\,p)\right]}{(K_a r_b\,p) - (K_a r_b) - (K_b r_a\,p)}. \tag{A17}$$
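For readers who want the intermediate algebra, one route from (A16) to (A17) is sketched below. It assumes $T \neq 0$, so that the bracketed factor in (A16) must vanish:

$$
\begin{aligned}
0 &= p\,r_a\left(1 - \frac{T}{K_a}\right) + (1 - p)\,r_b\left(1 - \frac{T}{K_b}\right) \\
p\,r_a + (1 - p)\,r_b &= T\left[\frac{p\,r_a}{K_a} + \frac{(1 - p)\,r_b}{K_b}\right] \\
T &= \frac{K_a K_b\left[p\,r_a + (1 - p)\,r_b\right]}{K_b\,r_a\,p + K_a\,r_b\,(1 - p)}
\end{aligned}
$$

Multiplying the numerator and denominator by $-1$ gives the form shown in (A17). As a check on the endpoints, $p = 0$ yields $T = K_b$ and $p = 1$ yields $T = K_a$, so the nullcline runs between the two carrying capacities.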
Equation (A17) is the $T$ nullcline. It is easy to see that this function, $T(p)$, is one-to-one: if $T(p_1) = T(p_2)$, then $p_1 = p_2$. The nullcline being a one-to-one function implies that every input that can bring the system to zero will only bring the system to zero. Now, we have $T$ totally in terms of $p$, $r_a$, $r_b$, $K_a$, and $K_b$. Next, we can plug it into Equation (A14) to obtain $\frac{dp}{dt}$ defined in terms of $p$ and the aforementioned $r$ and $K$ parameters:
$$\frac{dp}{dt} = p(1 - p)\left[r_a\left(1 - \frac{K_b\left[(r_b\,p) - r_b - (r_a\,p)\right]}{(K_a r_b\,p) - (K_a r_b) - (K_b r_a\,p)}\right) - r_b\left(1 - \frac{K_a\left[(r_b\,p) - r_b - (r_a\,p)\right]}{(K_a r_b\,p) - (K_a r_b) - (K_b r_a\,p)}\right)\right], \tag{A18}$$
which is always less than or equal to zero. Thus, the system can never cross the nullcline given by $\frac{dT}{dt} = 0$. We know that our system starts in the $\frac{dT}{dt} > 0$ space due to the initial conditions and positive $r$ and $K$ terms, but it can never cross into the $\frac{dT}{dt} < 0$ space. This contradicts our original assumption that there existed a time point $t$ at which $\frac{dT}{dt} < 0$. Hence, $\frac{dT}{dt}$ is never negative. The vector field given by $T$ and $p$ with the nullcline $\frac{dT}{dt} = 0$ (Appendix A Figure A1) shows that the $T$ nullcline passes the Horizontal Line Test, thus further demonstrating its injective nature. For visualization purposes, the total population, $T$ (the x-axis), has been scaled to $\max(K_a, K_b) = K_b = 1$. Accordingly, each of our variables, $p$ and $T$, is scaled by $K_b$, i.e., $p / K_b$ and $T / K_b$.
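The three properties used in this argument can be spot-checked numerically: $\frac{dT}{dt}$ vanishes on the curve (A17), $\frac{dp}{dt}$ is non-positive there, and $T(p)$ is strictly monotone (hence one-to-one). The sketch below evaluates them on a grid of $p$ values for one assumed parameter set satisfying $r_a > r_b$ and $K_a < K_b$; the parameter values and function names are hypothetical, and a grid check illustrates the claim rather than proving it.

```python
def nullcline_T(p, ra, Ka, rb, Kb):
    """Total population on the T nullcline, Equation (A17)."""
    numerator = Ka * Kb * (rb * p - rb - ra * p)
    denominator = Ka * rb * p - Ka * rb - Kb * ra * p
    return numerator / denominator

def dT_dt(p, T, ra, Ka, rb, Kb):
    """Equation (A15)."""
    return T * (p * ra * (1.0 - T / Ka) + (1.0 - p) * rb * (1.0 - T / Kb))

def dp_dt(p, T, ra, Ka, rb, Kb):
    """Equation (A14)."""
    return p * (1.0 - p) * (ra * (1.0 - T / Ka) - rb * (1.0 - T / Kb))

# Hypothetical r/K trade-off, population scaled so that Kb = 1.
params = (0.9, 0.8, 0.5, 1.0)  # ra, Ka, rb, Kb
ps = [i / 100.0 for i in range(101)]
Ts = [nullcline_T(p, *params) for p in ps]

# dT/dt should vanish everywhere on the nullcline.
max_abs_dT = max(abs(dT_dt(p, T, *params)) for p, T in zip(ps, Ts))
# dp/dt evaluated on the nullcline, Equation (A18), should never be positive.
max_dp = max(dp_dt(p, T, *params) for p, T in zip(ps, Ts))
# A strictly decreasing T(p) is one-to-one.
strictly_decreasing = all(t1 > t2 for t1, t2 in zip(Ts, Ts[1:]))

print(max_abs_dT, max_dp, strictly_decreasing)
```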
Figure A1. Growth dynamics for the two-clone case. Vector field given by Equations (A17) and (A18). The red curve shows where there is no change in total population (given by the nullcline $\frac{dT}{dt} = 0$). Arrows indicate total population growth ($\frac{dT}{dt}$). Velocities calculated as $\sqrt{\left(\frac{dT}{dt}\right)^2 + \left(\frac{dA}{dt}\right)^2}$. Velocity is always zero at the nullcline, thus showing that the system can never cross from the $\frac{dT}{dt} > 0$ into the $\frac{dT}{dt} < 0$ space.
Table A1. Adjusted $R^2$ for model fits across cell lines.

| Model | HGC-27 | KATOIII | MKN-45 | NCI-N87 | NUGC-4 | SNU-601 | SNU-638 | SNU-668 |
|---|---|---|---|---|---|---|---|---|
| Logistic | 1.00 | 0.97 | 0.99 | 0.99 | 0.98 | 0.99 | 0.97 | 0.97 |
| Richards | 1.00 | 0.97 | 0.99 | 0.99 | 0.97 | 0.99 | 0.97 | 0.98 |
| Gompertz | 1.00 | 0.95 | 0.96 | 0.95 | 0.98 | 0.98 | 0.95 | 0.98 |
Table A2. Inferred parameters for best logistic models across cell lines.

| Cell line | r | K | Best fit | Adjusted $R^2$ | Doubling time |
|---|---|---|---|---|---|
| HGC-27 | 0.96 | 5,095,556.00 | Logistic | 1.00 | 1.10 |
| KATOIII | 0.44 | 4,904,316.00 | Logistic | 0.97 | 4.70 |
| MKN-45 | 0.63 | 9,237,232.00 | Logistic | 0.99 | 3.00 |
| NCI-N87 | 0.45 | 38,306,195.00 | Logistic | 0.99 | 2.50 |
| NUGC-4 | 0.92 | 6,899,073.00 | Logistic | 0.98 | 1.10 |
| SNU-601 | 0.51 | 7,374,909.00 | Logistic | 0.99 | 2.30 |
| SNU-638 | 0.71 | 4,791,667.00 | Logistic | 0.97 | 1.20 |
| SNU-668 | 0.45 | 5,048,178.00 | Logistic | 0.98 | 1.90 |
Figure A2. Statistics evaluating the transcriptional activity of 25 KEGG pathways as biomarkers of growth rate (columns with prefix ‘r’) and carrying capacity (columns with prefix ‘K’): The 25 pathways were chosen because they were identified as differentially expressed between r/K-selected HeLa cells under low-density culture conditions [20]. For each pathway, a linear regression model was fitted to predict the respective growth parameter (r or K) from the pathway activity level measured in the same cell line (columns with suffix ‘r.squared’ and ‘p.value’). Analysis was performed across five gastric cancer cell lines used as the training dataset (HGC-27, KATOIII, NCI-N87, NUGC-4, and SNU-668). The top five pathways predictive of growth rate and the top five pathways predictive of carrying capacity (names in bold) were prioritized during training and further used in Figure 1B for validation with an extended dataset. p-values are color-coded depending on whether the respective pathway and growth parameter were positively (purple) or negatively (blue) correlated in the training dataset.
Table A3. Predicted growth parameters for clones (rows) identified in eight gastric cancer lines. Parameter predictions come from linear models built between pathway biomarkers and inferred parameter values at the cell line level. The input is clonal pathway activity, and the output is the predicted growth parameter for that clonal population.

| Clone ID | K | r | Cell line |
|---|---|---|---|
| 102944 | 4,690,105 | 0.49 | SNU-668 |
| 102945 | 4,674,250 | 0.38 | SNU-668 |
| 102948 | 3,450,777 | 0.53 | SNU-668 |
| 102950 | 4,331,258 | 0.66 | SNU-668 |
| 102951 | 5,488,043 | 0.53 | SNU-668 |
| 102952 | 4,379,298 | NA | SNU-668 |
| 102954 | 5,046,408 | 0.62 | SNU-668 |
| 102955 | 4,773,230 | 0.52 | SNU-668 |
| 106394 | 3,166,668 | 0.56 | KATOIII |
| 106396 | 3,117,464 | NA | KATOIII |
| 106399 | 3,274,342 | 0.53 | KATOIII |
| 106404 | 3,346,502 | 0.58 | KATOIII |
| 112380 | 7,448,893 | 0.32 | SNU-601 |
| 112382 | 8,148,176 | 0.34 | SNU-601 |
| 112387 | 7,231,439 | NA | SNU-601 |
| 112389 | 6,627,640 | 0.32 | SNU-601 |
| 112392 | 6,798,104 | 0.34 | SNU-601 |
| 112399 | 6,312,316 | NA | SNU-601 |
| 112402 | 8,050,488 | 0.38 | SNU-601 |
| 112404 | 7,081,546 | 0.31 | SNU-601 |
| 112408 | 8,255,791 | 0.37 | SNU-601 |
| 112410 | 7,903,615 | NA | SNU-601 |
| 112413 | 7,196,330 | 0.32 | SNU-601 |
| 114525 | 11,055,814 | 0.65 | MKN-45 |
| 114530 | 10,785,452 | 0.68 | MKN-45 |
| 119963 | 18,156,049 | NA | NCI-N87 |
| 119965 | 22,002,556 | 0.62 | NCI-N87 |
| 119967 | 20,354,070 | 0.57 | NCI-N87 |
| 119968 | 18,548,305 | 0.67 | NCI-N87 |
| 122360 | 5,156,522 | 0.80 | SNU-638 |
| 122361 | 4,331,370 | NA | SNU-638 |
| 122363 | 4,759,109 | 0.80 | SNU-638 |
| 125616 | 9,123,026 | 0.78 | NUGC-4 |
| 125618 | 9,655,395 | 0.69 | NUGC-4 |
| 125619 | 8,845,460 | 0.82 | NUGC-4 |
| 129343 | 9,429,235 | 0.85 | HGC-27 |
| 129344 | 8,996,528 | 0.82 | HGC-27 |
| 129345 | 8,856,699 | 0.76 | HGC-27 |
| 129346 | 8,616,362 | 0.78 | HGC-27 |
Figure A3. Difference in AIC scores across cell lines. Differences in AIC scores were calculated using the dAIC function from the package bbmle for the statistical programming language R. For 7/8 cell lines, the Richards model had the lowest AIC score. Thus, the added complexity could be justified by the increase in the goodness of fit.
Figure A4. Likelihood profile confidence intervals across parameters and model types. Confidence intervals generated from likelihood profiles for all parameters in Richards and logistic (Verhulst) growth models. Entries with >2-fold difference between the 2.5% and 97.5% quantiles highlighted in red. Growth rate (r) is not identifiable for NUGC-4 or SNU-668 in the Richards model. Additionally, loss of contact inhibition (v) is not identifiable for those two cell lines: NCI-N87, and SNU-638.
Figure A5. Likelihood profiles across cell lines. Graphical representation of the likelihood profiles for both parameters in the logistic growth model for all cell lines. Please note here that mumax is the growth rate (r). Maximum likelihood values are at the center of the x-axis, with the y-axis giving the absolute value of the z statistic from the chi-square distribution these profiles approximate. Confidence intervals given at various percentages. Tight confidence intervals tell us when parameters are identifiable.
Figure A6. Segmentation evaluation using Countess. Comparing Countess-derived cell counts to cell counts derived from live cell imaging for each cell line. Results do not appear to be confounded by flask type (color-code in legend).

References

  1. Marusyk, A.; Janiszewska, M.; Polyak, K. Intratumor Heterogeneity: The Rosetta Stone of Therapy Resistance. Cancer Cell 2020, 37, 471–484.
  2. Ben-David, U.; Amon, A. Context is everything: Aneuploidy in cancer. Nat. Rev. Genet. 2020, 21, 44–62.
  3. Zhu, S.; Qing, T.; Zheng, Y.; Jin, L.; Shi, L. Advances in single-cell RNA sequencing and its applications in cancer research. Oncotarget 2017, 8, 53763–53779.
  4. Hastings, P.; Lupski, J.R.; Rosenberg, S.M.; Ira, G. Mechanisms of change in gene copy number. Nat. Rev. Genet. 2009, 10, 551–564.
  5. Jiang, J.; Wang, D.D.; Yang, M.; Chen, D.; Pang, L.; Guo, S.; Cai, J.; Wery, J.P.; Li, L.; Li, H.Q.; et al. Comprehensive characterization of chemotherapeutic efficacy on metastases in the established gastric neuroendocrine cancer patient derived xenograft model. Oncotarget 2015, 6, 15639–15651.
  6. Li, Y.; Zhang, X.; Gong, J.; Zhang, Q.; Gao, J.; Cao, Y.; Wang, D.D.; Lin, P.P.; Shen, L. Aneuploidy of chromosome 8 in circulating tumor cells correlates with prognosis in patients with advanced gastric cancer. Chin. J. Cancer Res. 2016, 28, 579–588.
  7. Liang, L.; Fang, J.Y.; Xu, J. Gastric cancer and gene copy number variation: Emerging cancer drivers for targeted therapy. Oncogene 2016, 35, 1475–1482.
  8. Giam, M.; Rancati, G. Aneuploidy and chromosomal instability in cancer: A jackpot to chaos. Cell Div. 2015, 10, 3.
  9. Hanahan, D.; Weinberg, R.A. Hallmarks of Cancer: The Next Generation. Cell 2011, 144, 646–674.
  10. Taylor, A.M.; Shih, J.; Ha, G.; Gao, G.F.; Zhang, X.; Berger, A.C.; Schumacher, S.E.; Wang, C.; Hu, H.; Liu, J.; et al. Genomic and Functional Approaches to Understanding Cancer Aneuploidy. Cancer Cell 2018, 33, 676–689.
  11. Shlien, A.; Malkin, D. Copy number variations and cancer. Genome Med. 2009, 1, 62.
  12. Shukla, A.; Nguyen, T.H.M.; Moka, S.B.; Ellis, J.J.; Grady, J.P.; Oey, H.; Cristino, A.S.; Khanna, K.K.; Kroese, D.P.; Krause, L.; et al. Chromosome arm aneuploidies shape tumour evolution and drug response. Nat. Commun. 2020, 11, 449.
  13. Baslan, T.; Kendall, J.; Volyanskyy, K.; McNamara, K.; Cox, H.; D’Italia, S.; Ambrosio, F.; Riggs, M.; Rodgers, L.; Leotta, A.; et al. Novel insights into breast cancer copy number genetic heterogeneity revealed by single-cell genome sequencing. eLife 2020, 9, e51480.
  14. Wu, Y.; Grabsch, H.; Ivanova, T.; Tan, I.B.; Murray, J.; Ooi, C.H.; Wright, A.I.; West, N.P.; Hutchins, G.G.A.; Wu, J.; et al. Comprehensive genomic meta-analysis identifies intra-tumoural stroma as a predictor of survival in patients with gastric cancer. Gut 2013, 62, 1100–1111.
  15. Li, B.; Jiang, Y.; Li, G.; Fisher, G.A.; Li, R. Natural killer cell and stroma abundance are independently prognostic and predict gastric cancer chemotherapy benefit. JCI Insight 2020, 5, e136570.
  16. Mamlouk, S.; Childs, L.H.; Aust, D.; Heim, D.; Melching, F.; Oliveira, C.; Wolf, T.; Durek, P.; Schumacher, D.; Bläker, H.; et al. DNA copy number changes define spatial patterns of heterogeneity in colorectal cancer. Nat. Commun. 2017, 8, 14093.
  17. Chen, G.; Bradford, W.D.; Seidel, C.W.; Li, R. Hsp90 stress potentiates rapid cellular adaptation through induction of aneuploidy. Nature 2012, 482, 246–250.
  18. Lande, R.; Engen, S.; Sæther, B.E. An evolutionary maximum principle for density-dependent population dynamics in a fluctuating environment. Philos. Trans. R. Soc. B Biol. Sci. 2009, 364, 1511–1518.
  19. Caswell, H. Life History Theory and the Equilibrium Status of Populations. Am. Nat. 1982, 120, 317–339.
  20. Li, T.; Liu, J.; Feng, J.; Liu, Z.; Liu, S.; Zhang, M.; Zhang, Y.; Hou, Y.; Wu, D.; Li, C.; et al. Variation in the life history strategy underlies functional diversity of tumors. Natl. Sci. Rev. 2021, 8, nwaa124.
  21. Aktipis, C.A.; Boddy, A.M.; Gatenby, R.A.; Brown, J.S.; Maley, C.C. Life history trade-offs in cancer evolution. Nat. Rev. Cancer 2013, 13, 883–892.
  22. Andor, N.; Lau, B.T.; Catalanotti, C.; Sathe, A.; Kubit, M.; Chen, J.; Blaj, C.; Cherry, A.; Bangs, C.D.; Grimes, S.M.; et al. Joint single cell DNA-seq and RNA-seq of gastric cancer cell lines reveals rules of in vitro evolution. NAR Genom. Bioinform. 2020, 2, lqaa016.
  23. Minussi, D.C.; Nicholson, M.D.; Ye, H.; Davis, A.; Wang, K.; Baker, T.; Tarabichi, M.; Sei, E.; Du, H.; Rabbani, M.; et al. Breast tumours maintain a reservoir of subclonal diversity during expansion. Nature 2021, 592, 302–308.
  24. Tsoularis, A.; Wallace, J. Analysis of logistic growth models. Math. Biosci. 2002, 179, 21–55.
  25. Bergman, R.N.; Cobelli, C. Minimal modeling, partition analysis, and the estimation of insulin sensitivity. Fed. Proc. 1980, 39, 110–115.
  26. Kanehisa, M.; Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28, 27–30.
  27. Andor, N.; Simonds, E.F.; Czerwinski, D.K.; Chen, J.; Grimes, S.M.; Wood-Bouwens, C.; Zheng, G.X.Y.; Kubit, M.A.; Greer, S.; Weiss, W.A.; et al. Single-cell RNA-Seq of lymphoma cancers reveals malignant B cell types and co-expression of T cell immune checkpoints. Blood 2018.
  28. Street, K.; Risso, D.; Fletcher, R.B.; Das, D.; Ngai, J.; Yosef, N.; Purdom, E.; Dudoit, S. Slingshot: Cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genom. 2018, 19, 477.
  29. Ben-David, U.; Siranosian, B.; Ha, G.; Tang, H.; Oren, Y.; Hinohara, K.; Strathdee, C.A.; Dempster, J.; Lyons, N.J.; Burns, R.; et al. Genetic and transcriptional evolution alters cancer cell line drug response. Nature 2018, 560, 325–330.
  30. Kinsler, G.; Geiler-Samerotte, K.; Petrov, D.A. Fitness variation across subtle environmental perturbations reveals local modularity and global pleiotropy of adaptation. eLife 2020, 9, e61271.
  31. Gerlee, P.; Anderson, A.R.A. The evolution of carrying capacity in constrained and expanding tumour cell populations. Phys. Biol. 2015, 12, 056001.
  32. Aubry, M.; de Tayrac, M.; Etcheverry, A.; Clavreul, A.; Saikali, S.; Menei, P.; Mosser, J. ‘From the core to beyond the margin’: A genomic picture of glioblastoma intratumor heterogeneity. Oncotarget 2015, 6, 12094–12109.
  33. Bastola, S.; Pavlyukov, M.S.; Yamashita, D.; Ghosh, S.; Cho, H.; Kagaya, N.; Zhang, Z.; Minata, M.; Lee, Y.; Sadahiro, H.; et al. Glioma-initiating cells at tumor edge gain signals from tumor core cells to promote their malignancy. Nat. Commun. 2020, 11, 4660.
  34. Gatenby, R.A.; Silva, A.S.; Gillies, R.J.; Frieden, B.R. Adaptive therapy. Cancer Res. 2009, 69, 4894–4903.
  35. Rao, A.; Barkley, D.; França, G.S.; Yanai, I. Exploring tissue architecture using spatial transcriptomics. Nature 2021, 596, 211–220.
  36. Gatenby, R.A.; Brown, J.S. The Evolution and Ecology of Resistance in Cancer Therapy. Cold Spring Harb. Perspect. Med. 2020, 10, a040972.
  37. Liu, Y.; Mi, Y.; Mueller, T.; Kreibich, S.; Williams, E.G.; Van Drogen, A.; Borel, C.; Frank, M.; Germain, P.L.; Bludau, I.; et al. Multi-omic measurements of heterogeneity in HeLa cells across laboratories. Nat. Biotechnol. 2019, 37, 314–322.
  38. Stringer, C.; Wang, T.; Michaelos, M.; Pachitariu, M. Cellpose: A generalist algorithm for cellular segmentation. Nat. Methods 2021, 18, 100–106.
  39. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv 2015, arXiv:1505.04597.
  40. Bakhoum, S.F.; Ngo, B.; Laughney, A.M.; Cavallo, J.A.; Murphy, C.J.; Ly, P.; Shah, P.; Sriram, R.K.; Watkins, T.B.K.; Taunk, N.K.; et al. Chromosomal instability drives metastasis through a cytosolic DNA response. Nature 2018, 553, 467–472.
  41. Kimmel, G.J.; Beck, R.J.; Yu, X.; Veith, T.; Bakhoum, S.; Altrock, P.M.; Andor, N. Intra-tumor heterogeneity, turnover rate and karyotype space shape susceptibility to missegregation-induced extinction. PLoS Comput. Biol. 2023, 19, e1010815.
  42. Aibar, S.; González-Blas, C.B.; Moerman, T.; Huynh-Thu, V.A.; Imrichova, H.; Hulselmans, G.; Rambow, F.; Marine, J.C.; Geurts, P.; Aerts, J.; et al. SCENIC: Single-cell regulatory network inference and clustering. Nat. Methods 2017, 14, 1083–1086.
  43. Gerlee, P. The Model Muddle: In Search of Tumor Growth Laws. Cancer Res. 2013, 73, 2407–2411.
  44. Murphy, H.; Jaafari, H.; Dobrovolny, H.M. Differences in predictions of ODE models of tumor growth: A cautionary example. BMC Cancer 2016, 16, 163.
  45. Heuser, L.; Spratt, J.S.; Polk, H.C., Jr. Growth rates of primary breast cancers. Cancer 1979, 43, 1888–1894.
  46. Voulgarelis, D.; Bulusu, K.C.; Yates, J.W.T. Comparison of classical tumour growth models for patient derived and cell-line derived xenografts using the nonlinear mixed-effects framework. J. Biol. Dyn. 2022, 16, 160–185.
  47. Hall, B.G.; Acar, H.; Nandipati, A.; Barlow, M. Growth Rates Made Easy. Mol. Biol. Evol. 2014, 31, 232–238.
Figure 1. Identifying biomarkers of growth and carrying capacity: (A) Logistic growth curves fit to the cell counts of eight gastric cancer cell lines at various stages of their growth. Fits were shifted along the x-axis such that the midpoint of each curve lay above $x = 0$. The eight cell lines differed in their maximum growth rate (r), as well as their maximum sustainable population size (K). (B) Summary statistics of linear regression models used to correlate KEGG pathway activity levels with logistic function growth parameters. The top five models and their performance in the training cell lines are shown for growth rate (r: light gray rows) and carrying capacity (K: dark gray). Columns: Pearson = Pearson correlation coefficient; P = FDR corrected p-value; $R^2$ = adjusted $R^2$. (C,D) Relation between growth parameters ($r$, $K$) and the model with best performance in the validation dataset: ‘Amoebiasis’ (C) and ‘Arginine and proline metabolism’ (D), respectively. Pathway activity quantified using AUCell with scRNA-seq data. Error bars represent median absolute deviation. (E) Clonal composition confirmed by both scDNA- and scRNA-seq in eight gastric cancer cell lines (data taken from [22]). (F,G) Violin plots showing predicted values for clonal growth rate as a function of ‘Amoebiasis’ activity (F), and clonal carrying capacity as a function of ‘Arginine and proline metabolism’ activity (G). Wilcoxon signed rank test: o p < 0.1, * p < 0.05, ** p < 0.01. If a cell line had more than three significant clone pairs, we displayed only the three highest p-values below 0.1.
Figure 2. Using biomarkers of growth and contact inhibition to steer clonal evolution across passages: (A) Clone pairs with significant differences in both growth rates (r) and carrying capacities (K) were detected in two cell lines. Each color encodes a single pair of clones (legend). Each bar is calculated as the ratio of r or K between the pair of clones shown in the legend. Horizontal line at 1 indicates identical parameters for both clones. Clone pairs with potential r/K trade-offs are represented by bars on both sides of the horizontal line. The p-value of the difference in r or K between a given pair of clones is indicated on the top of each bar. (B) Heat map showing the outcome of clonal competition of the two clones shown in (A) for the SNU-668 cell line after 30 passages when grown at different seeding densities (x-axis) and passaging intervals (y-axis). Positive entries represent K selection ($Clone_K$ wins) and negative entries represent r selection ($Clone_r$ wins). The magnitude represents the size of the winning clone. Entries calculated as $\mathrm{sgn}(Clone_K - Clone_r) \cdot \log_{10}(\max(Clone_r, Clone_K))$. (C) Same as in (B), but for the NCI-N87 cell line. Note the presence of both large negative and large positive values for NCI-N87, in contrast to SNU-668, thus indicating the practical relevance of the r/K trade-off for the former. Highlighted in red are the conditions used for simulations in (E). (D) Simulated growth of the three largest clones (color-coded) detected in the NCI-N87 cell line over 40 days (smallest clone excluded due to insufficient G2M cell representation). (E) Change in frequency of the two NCI-N87 clones shown in (A) over multiple passages (x-axis). We predict that changing the seeding density and the timing of splitting the cells between passages (stars vs. circles in legend) will accelerate or delay the decline of clone 119967 (dark color shades), from passage 4 (intersection of star-shaped curves) to after passage 30 (circle-shaped curves). Clonal frequencies at harvest set the initial frequency conditions for each subsequent seeding.
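The heat map entries described in panels (B) and (C) can be illustrated with a minimal serial-passaging sketch. It is not the simulation code behind the figure: the two clones, their parameters, the 50:50 starting mix, the two passaging regimes, and the Runge-Kutta integrator are all assumptions chosen to show how seeding density and passaging interval can flip the sign of an entry.

```python
import math

def grow(n_r, n_k, days, r_r, K_r, r_k, K_k, h=0.01):
    """Grow two competing clones for `days` under the two-clone logistic model
    (each clone is limited by the total population) using Runge-Kutta steps."""
    def f(a, b):
        total = a + b
        return (r_r * a * (1.0 - total / K_r), r_k * b * (1.0 - total / K_k))
    for _ in range(int(round(days / h))):
        k1 = f(n_r, n_k)
        k2 = f(n_r + 0.5 * h * k1[0], n_k + 0.5 * h * k1[1])
        k3 = f(n_r + 0.5 * h * k2[0], n_k + 0.5 * h * k2[1])
        k4 = f(n_r + h * k3[0], n_k + h * k3[1])
        n_r += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        n_k += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return n_r, n_k

def heatmap_entry(seed, interval, params, passages=30, frac_r=0.5):
    """sgn(Clone_K - Clone_r) * log10(max(Clone_r, Clone_K)) after serial passaging.
    Frequencies at harvest set the composition of the next seeding."""
    n_r = n_k = 0.0
    for _ in range(passages):
        n_r, n_k = grow(seed * frac_r, seed * (1.0 - frac_r), interval, *params)
        frac_r = n_r / (n_r + n_k)
    sign = (n_k > n_r) - (n_k < n_r)
    return sign * math.log10(max(n_r, n_k))

# Hypothetical trade-off: Clone_r grows faster, Clone_K tolerates higher density.
params = (0.9, 0.8e6, 0.5, 1.0e6)  # r_r, K_r, r_k, K_k

entry_sparse = heatmap_entry(seed=1.0e4, interval=3.0, params=params)   # low density, early split
entry_dense = heatmap_entry(seed=5.0e5, interval=10.0, params=params)   # high density, late split
print(f"sparse: {entry_sparse:.2f}, dense: {entry_dense:.2f}")
```

Under these assumed values the sparse regime keeps the culture far below either carrying capacity, which favors the faster-growing clone, whereas the dense regime spends most of each passage between the two capacities, where only the clone with the higher capacity still grows.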

Share and Cite

MDPI and ACS Style

Veith, T.; Schultz, A.; Alahmari, S.; Beck, R.; Johnson, J.; Andor, N. Mathematical Modeling of Clonal Interference by Density-Dependent Selection in Heterogeneous Cancer Cell Lines. Cells 2023, 12, 1849. https://doi.org/10.3390/cells12141849
