*5.4. Mass Spectrometry for Sequence Confirmation*

A purified sample of recombinant Cystatin-Hv was subjected to in-solution trypsin digestion prior to analysis by LC-MS/MS. The resulting tryptic peptides were desalted, dried, and dissolved in 20 μL of 0.1% (*v*/*v*) formic acid, and 2 μL were automatically injected onto a 2 cm C-18 trap column (3 μm particle size, 100 Å pore size, 75 μm I.D., Thermo Fisher Scientific, Waltham, MA, USA) by an Easy nanoLC 1200 coupled to a Q Exactive Plus mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA). Chromatographic separation of the tryptic peptides was performed on a 15 cm analytical column (Acclaim PepMap, 2 μm particle size, 100 Å pore size, 50 μm I.D., Thermo Fisher Scientific, Waltham, MA, USA). Peptides were eluted with a linear gradient of 5–100% Buffer B (80% acetonitrile in 0.1% formic acid) over 30 min at a flow rate of 200 nL/min. The spray voltage was set to 2.4 kV, and the mass spectrometer was operated in positive, data-dependent mode: one full MS scan was acquired in the *m*/*z* range of 300–1500, followed by MS/MS acquisition using higher-energy collisional dissociation (HCD) of the seven most intense ions from the MS scan, with an isolation window of 2.0 *m*/*z*.
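
The data-dependent selection step described above can be illustrated with a minimal Python sketch. It is not the instrument's control logic: the function name `select_precursors` and the peak values are hypothetical, the isolation window is assumed to be centered on the precursor *m*/*z*, and criteria that are not stated in this section (for example dynamic exclusion or charge-state filtering) are left out.

```python
def select_precursors(peaks, top_n=7, mz_range=(300.0, 1500.0), isolation_width=2.0):
    """Pick the top_n most intense MS1 peaks inside mz_range.

    peaks: list of (mz, intensity) tuples from one full MS scan.
    Returns a list of (precursor_mz, window_low, window_high) tuples,
    ordered from most to least intense.
    """
    low, high = mz_range
    # Keep only peaks inside the acquired m/z range (300-1500 in the text).
    in_range = [(mz, intensity) for mz, intensity in peaks if low <= mz <= high]
    # Rank by intensity and keep the top_n (seven in the text).
    ranked = sorted(in_range, key=lambda peak: peak[1], reverse=True)[:top_n]
    # Assumption: the 2.0 m/z isolation window is centered on the precursor.
    half_width = isolation_width / 2.0
    return [(mz, mz - half_width, mz + half_width) for mz, _ in ranked]


# Hypothetical MS1 peak list, for illustration only.
example_scan = [(250.0, 9000.0), (400.0, 50.0), (500.0, 900.0), (1600.0, 7000.0)]
for precursor in select_precursors(example_scan, top_n=2):
    print(precursor)
```

In this sketch, peaks outside the 300–1500 range are ignored regardless of their intensity, so the two selected precursors are those at 500.0 and 400.0.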

The acquired MS and MS/MS spectra were analyzed using PEAKS Studio X, and the searches were performed against a customized database. Briefly, the database comprised all *Pichia pastoris* protein sequences downloaded from UniProt (a total of 16,348 sequences, downloaded on 14 October 2021) and the translated amino acid sequence of Cystatin-Hv (without the signal peptide sequence). This reference database was concatenated with common contaminants for mass spectrometry experiments (116 sequences), and decoy sequences were used for false discovery rate (FDR) control. The search engine was set to detect specific tryptic peptides at an FDR of 1%, allowing up to two missed cleavages. Methionine oxidation, acetylation of the protein N-termini, and deamidation of asparagine and glutamine were set as variable modifications, and carbamidomethylation of cysteine was set as a fixed modification.
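
To make the "specific tryptic peptides, two missed cleavages" setting concrete, the following Python sketch enumerates the peptides that such a search would consider for a given sequence. It is an illustration only, not the PEAKS implementation: the function names and the example sequence are hypothetical, and the cleavage rule is assumed to be the common one (cleavage C-terminal to K or R, except when the next residue is P).

```python
def cleavage_fragments(sequence):
    """Split a protein sequence at every tryptic cleavage site.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal fragment that does not end in K or R.
        fragments.append(sequence[start:])
    return fragments


def tryptic_peptides(sequence, max_missed=2):
    """List fully tryptic peptides with at most max_missed missed cleavages."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        # Joining k + 1 adjacent fragments corresponds to k missed cleavages.
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Hypothetical sequence, not the Cystatin-Hv sequence.
print(tryptic_peptides("AAKCCRDDKEE"))
```

A search engine would additionally filter these candidates by length and mass and apply the fixed and variable modifications listed above; those steps are omitted here.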
