*4.5. Filtering Features for Clustering*

Pearson correlation coefficients were calculated between all pairs of sequence and structure features (Supplementary Table S2). If two features showed a correlation with an absolute value above 0.7, only one was kept. In each case, we discarded the feature that was highly correlated with the larger number of other features, or the one with the lower standard deviation. In total, none of the seven sequence parameters was discarded, but 13 of the 24 structure parameters were omitted from subsequent clustering steps.
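
The filtering rule can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the toy data, and the greedy order of removal are assumptions made for illustration. In particular, the sketch treats the standard deviation as a tie-breaker when two features are highly correlated with the same number of other features, which is one possible reading of the rule, and it assumes that no feature is constant.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def sample_std(x):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(x)
    mean = sum(x) / n
    return math.sqrt(sum((a - mean) ** 2 for a in x) / (n - 1))


def filter_correlated(features, threshold=0.7):
    """Greedily drop features until no remaining pair has |r| > threshold.

    features: dict mapping a feature name to its list of values.
    Returns (kept, dropped) as lists of feature names.
    """
    # Absolute pairwise correlations, computed once on the full feature set.
    corr = {
        frozenset(pair): abs(pearson(features[pair[0]], features[pair[1]]))
        for pair in combinations(features, 2)
    }
    kept, dropped = list(features), []
    while True:
        # For each remaining feature, count its highly correlated partners.
        counts = {
            f: sum(1 for g in kept
                   if g != f and corr[frozenset((f, g))] > threshold)
            for f in kept
        }
        if not counts or max(counts.values()) == 0:
            break
        # Drop the feature with the most partners; ties (an assumption)
        # are broken in favour of dropping the lower standard deviation.
        worst = max(kept, key=lambda f: (counts[f], -sample_std(features[f])))
        kept.remove(worst)
        dropped.append(worst)
    return kept, dropped


# Hypothetical toy data: a, b and c are mutually correlated, d is not.
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [1.1, 1.9, 3.2, 3.9, 5.1],
    "d": [5, 1, 4, 2, 3],
}
kept, dropped = filter_correlated(toy)
print("kept:", kept, "dropped:", dropped)
```

Recomputing the partner counts after each removal prevents a feature from being discarded only because of correlations with features that have already been removed.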
