*3.4. ORCA Pipeline*

Code, data files, and supporting documentation on the use and workings of the ORCA pipeline are available at https://github.com/c-leber/ORCA; the parameter sets used for the analyses reported in this study are listed in Tables S6–S8. ORCA was written in Python [81] and builds on the following Python packages: pandas (0.25.2) [82,83], numpy (1.16.5) [84,85], pyteomics (4.1.2) [86,87], scipy (1.3.1) [88], networkx (2.4) [89], matplotlib (3.0.3) [90], sklearn (0.21.3) [91], and seaborn (0.9.0) [92]. ORCA is distributed as a Jupyter Notebook [93,94] to facilitate customization and interactive experimentation. Prior to analysis in ORCA, proprietary LC-MS data files were converted to mzXML using MSCONVERT (https://bio.tools/msconvert) [95], which is part of the ProteoWizard Library [96]. MSCONVERT was also used to convert proprietary LC-MS/MS data files to mzML for the ORCA MS<sup>2</sup> Auxiliary pipeline, and to mzXML or mzML for GNPS.
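
For readers who wish to reproduce the software environment, the following is a minimal sketch (not part of ORCA itself) of how the installed package versions could be compared against those listed above, using only the Python standard library. The names `PINNED`, `installed_version`, and `compare_versions` are hypothetical and introduced here for illustration; the sketch also assumes that "sklearn" refers to the distribution published as `scikit-learn`, and that Python 3.8 or later is available for `importlib.metadata`.

```python
from importlib import metadata

# Versions as listed in the text. Keys are distribution names;
# "scikit-learn" is assumed to be the distribution behind "sklearn".
PINNED = {
    "pandas": "0.25.2",
    "numpy": "1.16.5",
    "pyteomics": "4.1.2",
    "scipy": "1.3.1",
    "networkx": "2.4",
    "matplotlib": "3.0.3",
    "scikit-learn": "0.21.3",
    "seaborn": "0.9.0",
}


def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(pinned, lookup=installed_version):
    """Map each package to (wanted, found, status).

    status is "match", "differs", or "missing". The lookup argument
    exists so the comparison can be exercised without a real environment.
    """
    report = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found is None:
            status = "missing"
        elif found == wanted:
            status = "match"
        else:
            status = "differs"
        report[name] = (wanted, found, status)
    return report


if __name__ == "__main__":
    for name, (wanted, found, status) in compare_versions(PINNED).items():
        print(f"{name:<14} wanted {wanted:<8} found {found!s:<10} {status}")
```

A "differs" result does not necessarily mean that ORCA will fail to run; it only flags a departure from the versions reported here.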
