*3.2. GlcA Ring Pucker Effects*

An initial implementation of the construction algorithm used default force-field geometries for all GalNAc and GlcA rings. As a result, all GalNAc and GlcA rings in an algorithmically constructed conformation had the same internal geometries. Ensembles constructed using this version of the algorithm had longer average end-to-end distances than MD-generated ensembles (Figure S1), which meant that, on average, constructed conformations were overly extended. The default force-field ring pucker geometry for both types of monosaccharides was <sup>4</sup>C<sub>1</sub>. With that ring pucker, all βGlcA and all but one βGalNAc exocyclic functional groups are equatorial, and therefore the <sup>4</sup>C<sub>1</sub> ring pucker is expected to be strongly preferred over other ring pucker geometries. To validate this simple approach to assigning ring pucker geometry, we computed Cremer–Pople (C-P) parameters of each monosaccharide ring in the MD-simulated 20-mer ensemble (10 rings \* 40,000 snapshots = 400,000 ring conformations for each of the two monosaccharide types). As seen in NMR and force-field studies, the stable <sup>4</sup>C<sub>1</sub> chair ring pucker was the principal conformer for both GlcA [50,96–99] and GalNAc [46,96,98,100] in the MD simulations, with slight deformations (0° < C-P θ < 30°) (Figure 4a,b). However, a small minority of GlcA ring conformers were skew-boat or boat, namely <sup>3</sup>S<sub>1</sub>, B<sub>1,4</sub>, <sup>5</sup>S<sub>1</sub>, <sup>2,5</sup>B, <sup>2</sup>S<sub>O</sub>, B<sub>3,O</sub>, <sup>1</sup>S<sub>3</sub>, <sup>1,4</sup>B, and <sup>1</sup>S<sub>5</sub> (60° < C-P θ < 120°) (Figure 4b). Studies that performed unbiased MD simulations with other force fields observed skew-boat and boat ring puckers of non-sulfated GlcA monosaccharides on the microsecond timescale, but the occurrences were negligible due to high energy barriers [50,98]. In line with those findings, we observed only occasional GlcA skew-boat and boat pucker transitions in chondroitin 20-mers in our 500-ns unbiased CHARMM simulations.
However, the C-P φ values of non-<sup>4</sup>C<sub>1</sub> GlcA conformers in these studies differed from ours. Specifically, one study found <sup>2</sup>S<sub>O</sub>, B<sub>3,O</sub>, <sup>1</sup>S<sub>3</sub>, <sup>1,4</sup>B, and <sup>1</sup>S<sub>5</sub> [98]. Slight differences could be explained by differing ion concentrations, which likely affected pyranose ring puckers [101]. It is more likely, though, that the differences primarily result from intramolecular interactions: the aforementioned literature data come from simulated GlcA monosaccharides only, whereas our results come from simulated chondroitin 20-mers.
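
As an illustration of the Cremer–Pople analysis described above, the following is a minimal sketch of how the puckering amplitude Q and the angles θ and φ of a six-membered ring could be computed from Cartesian coordinates. It is not the analysis code used in this work. The function names, the assumed atom order (O5, C1, C2, C3, C4, C5), and the idealized `model_ring` test geometry are all hypothetical; the θ bands in `pucker_band` simply restate the ranges quoted in the text.

```python
import math

def cremer_pople(ring):
    """Return (Q, theta_deg, phi_deg) for six ring-atom (x, y, z) positions.

    Assumption: atoms are ordered O5, C1, C2, C3, C4, C5.
    """
    n = 6
    if len(ring) != n:
        raise ValueError("expected six ring atoms")
    # Shift the ring so that its geometric center sits at the origin.
    center = [sum(atom[k] for atom in ring) / n for k in range(3)]
    pos = [[atom[k] - center[k] for k in range(3)] for atom in ring]
    # Mean-plane normal from the Cremer-Pople vectors R' and R''.
    r1 = [sum(p[k] * math.sin(2 * math.pi * j / n) for j, p in enumerate(pos))
          for k in range(3)]
    r2 = [sum(p[k] * math.cos(2 * math.pi * j / n) for j, p in enumerate(pos))
          for k in range(3)]
    normal = [r1[1] * r2[2] - r1[2] * r2[1],
              r1[2] * r2[0] - r1[0] * r2[2],
              r1[0] * r2[1] - r1[1] * r2[0]]
    length = math.sqrt(sum(c * c for c in normal))
    normal = [c / length for c in normal]
    # Out-of-plane displacement of each ring atom.
    z = [sum(p[k] * normal[k] for k in range(3)) for p in pos]
    # Puckering coordinates for a six-membered ring (m = 2 and m = 3 terms).
    a = math.sqrt(2 / n) * sum(zj * math.cos(4 * math.pi * j / n)
                               for j, zj in enumerate(z))   # q2 * cos(phi)
    b = -math.sqrt(2 / n) * sum(zj * math.sin(4 * math.pi * j / n)
                                for j, zj in enumerate(z))  # q2 * sin(phi)
    q2 = math.hypot(a, b)
    q3 = math.sqrt(1 / n) * sum((-1) ** j * zj for j, zj in enumerate(z))
    total_q = math.hypot(q2, q3)
    theta = math.degrees(math.atan2(q2, q3))        # 0..180 degrees
    phi = math.degrees(math.atan2(b, a)) % 360.0    # ill-defined when q2 ~ 0
    return total_q, theta, phi

def pucker_band(theta_deg):
    """Coarse label using only the theta ranges quoted in the text."""
    if theta_deg < 30.0:
        return "4C1-like chair"
    if 60.0 < theta_deg < 120.0:
        return "boat/skew-boat"
    return "other"

def model_ring(z_offsets, radius=1.4):
    """Hypothetical test ring: a hexagon numbered clockwise when viewed
    from +z, with the given out-of-plane offsets."""
    return [(radius * math.cos(2 * math.pi * j / 6),
             -radius * math.sin(2 * math.pi * j / 6),
             dz) for j, dz in enumerate(z_offsets)]

chair = model_ring([0.25 * (-1) ** j for j in range(6)])
twisted = model_ring([0.4 * math.cos(math.radians(90) + 4 * math.pi * j / 6)
                      for j in range(6)])
print(pucker_band(cremer_pople(chair)[1]))
print(pucker_band(cremer_pople(twisted)[1]))
```

The two model rings are idealized inputs chosen so that one falls in the chair band and the other in the boat/skew-boat band. With coordinates from a real trajectory, the mapping from (φ, θ) to a named conformer depends on the atom ordering convention, which is only assumed here.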

**Figure 4.** Cremer–Pople data for GalNAc and GlcA in (**a**,**b**) MD-generated ensembles and constructed ensembles (**c**,**d**) before and (**e**,**f**) after energy minimization; geometries from the four sets of each type of ensemble are represented by red, green, blue, and magenta dots, respectively, and the force-field geometry is represented by a single large black dot. Cremer–Pople parameters (φ, θ) for all rings in every tenth snapshot from each ensemble were plotted (i.e., 10 rings \* 1,000 snapshots per run \* 4 runs = 40,000 parameter sets). Because the algorithm draws on all ring conformations sampled in MD, whereas only every tenth snapshot is plotted, not all datapoints in panels (**c**–**f**) are seen in panels (**a**,**b**); the full MD-generated dataset, however, contains all datapoints in the constructed ensembles.

The MD-generated 20-mer GlcA ring conformations can be separated into two broad categories: those that do not introduce a kink into the polymer and those that do. The former category encompasses the <sup>4</sup>C<sub>1</sub> and <sup>2</sup>S<sub>O</sub> GlcA ring puckers, both of which place the two glycosidic linkage oxygen atoms, located at opposite ends of the ring, in equatorial positions (Figure 5a). As such, the O-C1 and C4-O bond vectors therein are approximately parallel and promote extended polymer conformations. The latter category encompasses the <sup>3</sup>S<sub>1</sub>, B<sub>1,4</sub>, <sup>5</sup>S<sub>1</sub>, <sup>2,5</sup>B, B<sub>3,O</sub>, <sup>1</sup>S<sub>3</sub>, <sup>1,4</sup>B, and <sup>1</sup>S<sub>5</sub> GlcA ring puckers (Figure 5b). These ring puckers all place one of the glycosidic linkage oxygen atoms in the equatorial position and the other in the axial position. For these ring puckers, the O-C1 and C4-O bond vectors are approximately perpendicular, which results in a kink in the polymer chain and can reduce end-to-end distance even when the remainder of the polymer is fully extended (Figure 5b).
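
The parallel-versus-perpendicular criterion above can be made concrete by measuring the angle between the two linkage bond vectors of a GlcA ring. The sketch below is illustrative only: the function names and the 45° cutoff separating "extended" from "kinked" are assumptions made for this example, not values taken from this work.

```python
import math

def bond_vector_angle(o_at_c1, c1, c4, o_at_c4):
    """Angle in degrees between the O->C1 and C4->O bond vectors."""
    v1 = [b - a for a, b in zip(o_at_c1, c1)]   # linkage oxygen -> C1
    v2 = [b - a for a, b in zip(c4, o_at_c4)]   # C4 -> linkage oxygen
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

def introduces_kink(angle_deg, cutoff_deg=45.0):
    """Hypothetical rule: vectors far from parallel are treated as a kink."""
    return angle_deg > cutoff_deg

# Idealized coordinates: roughly parallel vectors versus perpendicular ones.
extended = bond_vector_angle((0, 0, 0), (1, 0, 0), (3, 0.5, 0), (4, 0.5, 0))
kinked = bond_vector_angle((0, 0, 0), (1, 0, 0), (3, 0.5, 0), (3, 1.5, 0))
print(extended, introduces_kink(extended))
print(kinked, introduces_kink(kinked))
```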

**Figure 5.** (**a**) 20-mer conformation with a <sup>2</sup>S<sub>O</sub> GlcA conformer (colored by atom type with flanking linkage atoms highlighted in purple) and close-ups of GlcA monosaccharide rings in <sup>4</sup>C<sub>1</sub> and <sup>2</sup>S<sub>O</sub> conformations (showing endocyclic ring atoms and linker oxygen atoms only); (**b**) 20-mer conformation with a kink at a <sup>5</sup>S<sub>1</sub> GlcA conformer (colored by atom type with flanking linkage atoms highlighted in purple) and GlcA monosaccharide rings in <sup>5</sup>S<sub>1</sub>, <sup>3</sup>S<sub>1</sub>, B<sub>1,4</sub>, <sup>2,5</sup>B, B<sub>3,O</sub>, <sup>1</sup>S<sub>3</sub>, <sup>1,4</sup>B, and <sup>1</sup>S<sub>5</sub> conformations (showing endocyclic ring atoms and linker oxygen atoms only); (**c**) 20-mer conformation with a curve caused by flexible glycosidic linkage geometries (highlighted in purple) and all monosaccharides in <sup>4</sup>C<sub>1</sub> conformations; all images came from MD-generated ensembles.

As such, the final version of the construction algorithm uses MD-generated ring conformations instead of default force-field topology geometries. As an added benefit, this approach includes not only MD-generated ring dihedral angles but also bond lengths and angles. Using this finalized version of the algorithm, the peak in the end-to-end distance histogram for the constructed ensemble was shifted left compared to that resulting from force-field topology ring geometries (Figure 6 vs. Figure S1) and much more closely matched the reference MD-generated ensemble data. This finding shows the importance of accounting for ring flexibility in constructing chondroitin glycosaminoglycan polymer conformations similar to those sampled in all-atom explicit-solvent MD simulations. Of note, the radius of gyration was also analyzed as a function of end-to-end distance in MD-generated and constructed ensembles after minimization (Figure S2a,b). These results showed that the radius of gyration is highly correlated with end-to-end distance in both MD-generated and constructed ensembles.
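
To make the relationship between radius of gyration and end-to-end distance concrete, the sketch below computes both quantities for a set of conformations and reports their Pearson correlation coefficient. It is a simplified illustration under stated assumptions: the radius of gyration is unweighted (all atoms treated as equal mass), the end-to-end distance is taken between the first and last points supplied, and the straight-rod "conformations" are toy inputs rather than data from this work.

```python
import math

def end_to_end_distance(coords):
    """Distance between the first and last points of a conformation."""
    return math.dist(coords[0], coords[-1])

def radius_of_gyration(coords):
    """Unweighted radius of gyration (assumes equal masses)."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    return math.sqrt(sum(math.dist(p, center) ** 2 for p in coords) / n)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy ensemble: straight five-point rods with different spacings.
rods = [[(i * spacing, 0.0, 0.0) for i in range(5)] for spacing in (1.0, 2.0, 3.0)]
distances = [end_to_end_distance(rod) for rod in rods]
radii = [radius_of_gyration(rod) for rod in rods]
print(pearson_r(distances, radii))
```

For rigid rods the two quantities are exactly proportional, so this toy ensemble only checks the arithmetic; real ensembles would show scatter around the trend.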


**Figure 6.** End-to-end distance probability distribution of MD-generated (blue dashed lines) and constructed (red solid lines) 20-mer ensembles; each type of ensemble includes four sets of 10,000 conformations; probabilities were calculated for end-to-end distances sorted into 0.5 Å bins.
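
The binning described in the Figure 6 caption could be reproduced along the following lines. This is a minimal sketch, assuming that each distance is assigned to the 0.5 Å bin whose lower edge is the largest multiple of 0.5 Å not exceeding it; the function name and the sample distances are hypothetical.

```python
def end_to_end_distribution(distances, bin_width=0.5):
    """Map each bin's lower edge (in Å) to the fraction of conformations in it."""
    counts = {}
    for d in distances:
        index = int(d // bin_width)          # bin index by lower edge
        counts[index] = counts.get(index, 0) + 1
    total = len(distances)
    return {index * bin_width: count / total
            for index, count in sorted(counts.items())}

# Hypothetical end-to-end distances in Å.
sample = [10.1, 10.4, 10.6, 12.0]
distribution = end_to_end_distribution(sample)
for lower_edge, probability in distribution.items():
    print(lower_edge, probability)
```

Because the probabilities are normalized by the number of conformations, distributions from ensembles of different sizes can be overlaid directly.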

While polymer kinks from ring puckering can shorten polymer end-to-end distances, they are not required to do so. For 20-mers, flexibility in the glycosidic linkages, even with <sup>4</sup>C<sub>1</sub> ring puckering in all constituent monosaccharides, can be sufficient to produce compact conformations (Figure 5c). Furthermore, because of the flexibility in the glycosidic linkages flanking non-<sup>4</sup>C<sub>1</sub> ring puckers, polymer kinks from ring puckering do not always lead to compact conformations. Thus, the inclusion of non-<sup>4</sup>C<sub>1</sub> ring puckers, which shifts the end-to-end distance histogram leftward, supplements glycosidic linkage flexibility in yielding compact conformations.
