**4. Conclusions**

With (1) all bond, bond angle, and dihedral angle conformational parameters from MD incorporated into the algorithm, (2) monosaccharide rings and glycosidic linkages treated independently, (3) energy minimization performed on each constructed conformation, and (4) a bond potential energy cutoff applied, end-to-end distance probability distributions from constructed and MD-generated ensembles match, with minimal differences in most probable end-to-end distances (Table 2 and Figures 6 and 10–12), suggesting that our algorithm produces conformational ensembles that mimic the backbone flexibility seen in MD simulations of non-sulfated chondroitin polymers.
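As an illustration of the comparison summarized above, the following minimal sketch bins end-to-end distances from two ensembles into probability distributions and reports the difference between their most probable distances. It is not the analysis code used in this work: the function names, the 1 Å default bin width, and the assumption that distances are supplied as plain arrays in Å are all hypothetical choices made for the example.

```python
import numpy as np


def end_to_end_distribution(distances, bin_width=1.0):
    """Return (bin_centers, probabilities) for end-to-end distances.

    `distances` is a 1-D sequence of non-negative distances (assumed in Å);
    `bin_width` is an illustrative choice, not a value taken from the paper.
    """
    distances = np.asarray(distances, dtype=float)
    if distances.size == 0:
        raise ValueError("need at least one distance")
    # Bins start at 0 and extend far enough to cover the largest distance.
    n_bins = max(1, int(np.ceil(distances.max() / bin_width)))
    edges = np.arange(n_bins + 1) * bin_width
    counts, _ = np.histogram(distances, bins=edges)
    probabilities = counts / counts.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, probabilities


def most_probable_distance(distances, bin_width=1.0):
    """Center of the most populated bin (first one if there is a tie)."""
    centers, probabilities = end_to_end_distribution(distances, bin_width)
    return centers[np.argmax(probabilities)]


def mode_shift(md_distances, constructed_distances, bin_width=1.0):
    """Absolute difference between the most probable distances of two ensembles."""
    return abs(
        most_probable_distance(constructed_distances, bin_width)
        - most_probable_distance(md_distances, bin_width)
    )
```

Calling `mode_shift` with the MD-generated and constructed distance arrays gives a single number for how far apart the two distribution peaks are; the result depends on the bin width, so the same value should be used for both ensembles.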

Our program is also valuable for its efficiency. For example, the fully solvated chondroitin 20-mer system contains ~191,000 atoms and took about one month to simulate. In contrast, constructing the 20-mer conformational ensembles using our algorithm took only about 12.5 h, including the time to write the following output: all end-to-end distances, radii of gyration, bond lengths, and dihedral angles; monosaccharide ring PDB files for C-P analysis of every tenth frame; bond and system potential energies before and after minimization; C-P parameters of every GlcA ring in every frame; and PDBs of all conformations with bond energies greater than that of the fully extended 20-mer (including conformations with pierced rings). The fully solvated chondroitin 10-mer system contains ~36,000 atoms and took about five days to simulate, whereas the 10-mer ensembles were constructed using our algorithm in about 40 min. In that case, the output written comprised energies, dihedral angles, end-to-end distances, and radii of gyration after minimization; C-P parameters of every GlcA ring in every frame; and PDBs of all conformations with bond energies greater than that of the fully extended 10-mer. Fully solvated chondroitin 100-mer and 200-mer systems would contain ~3,370,000 atoms and ~75,450,000 atoms, respectively. Systems of this magnitude are not feasible to simulate with current computational resources; even if they could be simulated, the simulations would take on the order of years to complete. Construction of the chondroitin 100-mer and 200-mer conformational ensembles using our algorithm took about four and nine hours, respectively, and the algorithm produced the same output data types as for the 10-mer (n.b.: these timings are shorter than the 20-mer timing above because of the additional output written to disk for analysis during the 20-mer ensemble construction).
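To put the timings quoted above on a common scale, the short sketch below converts them to hours and computes approximate speedup factors. It assumes that "about one month" means 30 days; the dictionary layout and function name are illustrative only, and the resulting factors are rough because the reported timings are themselves approximate and include different amounts of output written to disk.

```python
# Approximate wall-clock times in hours, taken from the text above.
# ASSUMPTION: "about one month" is treated as 30 days.
TIMINGS_HOURS = {
    "20-mer": {"md": 30 * 24, "construction": 12.5},
    "10-mer": {"md": 5 * 24, "construction": 40 / 60},
}


def speedup(md_hours, construction_hours):
    """Ratio of MD simulation time to ensemble construction time."""
    return md_hours / construction_hours


for name, t in TIMINGS_HOURS.items():
    factor = speedup(t["md"], t["construction"])
    print(f"{name}: construction is roughly {factor:.0f}x faster than MD")
```

No factor is computed for the 100-mer and 200-mer because no MD timings exist for those systems.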

In conclusion, our algorithm, incorporating glycosidic linkage and monosaccharide ring conformations from the MD simulation of non-sulfated chondroitin 20-mers, can be used to efficiently generate conformational ensembles of non-sulfated chondroitin polymers of arbitrary length. We are investigating the applicability of this approach to various sulfo-forms of CS and different types of GAGs, including hyaluronan (HA), dermatan sulfate (DS), heparan sulfate (HS), and keratan sulfate (KS). Given the variability and complexity of GAGs, as well as existing barriers to the experimental characterization of the three-dimensional conformational properties of GAGs of lengths relevant in the context of PGs, there are currently very few efforts to target GAGs. We anticipate that the presented algorithm, combined with experimental data on PG core proteins and conformational analysis of the linker tetrasaccharide [48,52,102–104], may provide a useful means of generating atomic-resolution three-dimensional models of full PGs. The algorithm could also be used to model full GAG–protein complexes, which may provide insights into potential interactions between multiple biomolecules within a single GAG complex. The ability to model these complex biomolecules would be a key step towards improving understanding of GAG bioactivity, assessing the druggability of GAGs, designing agonists or antagonists to treat disease, and developing diagnostic tools. Thus, this methodology may open a new avenue into disease modulation.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2218-273X/10/4/537/s1, Table S1: Comparison to Observed Literature Values of Glycosidic Linkage Dihedrals (φ, ψ) in Non-Sulfated Chondroitin, Table S2: Bond Energies of Constructed Chondroitin 20-mer Conformations with Pierced Rings, Figure S1: End-to-end distance probability distribution of 20-mer ensembles generated by MD and an early version of the construction algorithm which applied glycosidic linkage geometries from MD-generated 20-mer ensembles, Figure S2: Scatterplots of radius of gyration as a function of end-to-end distance in MD-generated and constructed chondroitin ensembles, Figure S3: ∆*G*(φ,ψ) plots for each glycosidic linkage in the chondroitin 20-mer from MD-generated ensembles, Figure S4: Cremer-Pople plots for each monosaccharide ring in the chondroitin 20-mer from MD-generated ensembles, Figure S5: Probability histograms of bond lengths for each type of bond in the chondroitin 20-mer from MD-generated and constructed ensembles, Figure S6: Probability histogram showing changes in glycosidic linkage φ and ψ dihedral angles during energy minimization in constructed 20-mer ensembles, Figure S7: Cremer-Pople plots of GalNAc and GlcA in MD-generated chondroitin 10-mer ensembles, Figure S8: End-to-end distance probability distribution of chondroitin 20-mer ensembles generated by MD and an early version of the construction algorithm which applied glycosidic linkage geometries from MD-generated non-sulfated chondroitin disaccharide ensembles, Figure S9: ∆*G*(φ,ψ) plots for GlcAβ1-3GalNAc and GalNAcβ1-4GlcA glycosidic linkages in MD-generated non-sulfated chondroitin disaccharide ensembles and 20-mer ensembles constructed using glycosidic linkage dihedral probabilities from these MD-generated disaccharide ensembles, Figure S10: Bond energy distribution probability histograms from constructed 10-, 20-, 100-, and 200-mer ensembles, Figure S11: Average bond energies as a function of polymer length.

**Author Contributions:** Conceptualization, E.K.W. and O.G.; methodology, E.K.W., G.V., H.S., and O.G.; software, E.K.W., G.V., H.S., and O.G.; validation, E.K.W. and O.G.; formal analysis, E.K.W.; investigation, E.K.W. and O.G.; resources, O.G.; data curation, E.K.W. and O.G.; writing—original draft preparation, E.K.W.; writing—review and editing, E.K.W., G.V., H.S., and O.G.; visualization, E.K.W.; supervision, O.G.; project administration, O.G.; funding acquisition, O.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research and the APC were funded by the National Science Foundation, grant number MCB-1453529 to O.G.

**Conflicts of Interest:** The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
